Description

[0001] The present invention relates to single chain multivalent antibodies. 

[0002] Antibodies are proteins belonging to a group of immunoglobulins elicited by the immune system in response
to a specific antigen or substance which the body deems foreign. There are five classes of human antibodies, each
class having the same basic structure. The basic structure of an antibody is a tetramer, or a multiple thereof, composed
of two identical heterodimers each consisting of a light and a heavy chain. The light chain is composed of one variable
(V) and one constant (C) domain, while a heavy chain is composed of one variable and three or more constant domains.
The variable domains from both the light and heavy chain, designated V L and V H respectively, determine the specificity
of an immunoglobulin, while the constant (C) domains carry out various effector functions.

[0003] Amino acid sequence data indicate that each variable domain comprises three complementarity determining
regions (CDR) flanked by four relatively conserved framework regions (FR). The FR are thought to maintain the
structural integrity of the variable region domain. The CDR have been assumed to be responsible for the binding specificity
of individual antibodies and to account for the diversity of binding of antibodies.

[0004] As the basic structure of an antibody contains two heterodimers, antibodies are multivalent molecules. For
example, the IgG classes have two identical antigen binding sites, while the pentameric IgM class has 10 identical
binding sites.

[0005] Monoclonal antibodies having identical genetic parentage and binding specificity have been useful both as
diagnostic and therapeutic agents. Monoclonal antibodies are routinely produced by hybridomas generated by fusion
of mouse lymphoid cells with an appropriate mouse myeloma cell line according to established procedures. The
administration of murine antibodies for in vivo therapy and diagnostics in humans is limited, however, due to the human
anti-mouse antibody response elicited by the human immune system.

[0006] Chimeric antibodies, in which the binding or variable regions of antibodies derived from one species are
combined with the constant regions of antibodies derived from a different species, have been produced by recombinant
DNA methodology. See, for example, Sahagen et al., J. Immunol., 137:1066-1074 (1986); Sun et al., Proc. Natl. Acad.
Sci. USA, 82:214-218 (1987); Nishimura et al., Cancer Res., 47:999-1005 (1987); and Lie et al., Proc. Natl. Acad. Sci.
USA, 84:3439-3443 (1987), which disclose chimeric antibodies to tumor-associated antigens. Typically, the variable
region of a murine antibody is joined with the constant region of a human antibody. It is expected that as such chimeric
antibodies are largely human in composition, they will be substantially less immunogenic than murine antibodies.

[0007] Chimeric antibodies still carry the Fc regions which are not necessary for antigen binding, but constitute a
major portion of the overall antibody structure which affects its pharmacokinetics. For the use of antibodies in
immunotherapy or immunodiagnostics, it is desirable to have antibody-like molecules which localize and bind to the target
tissue rapidly and for the unbound material to clear quickly from the body. Generally, smaller antibody fragments have
greater capillary permeability and are more rapidly cleared from the body than whole antibodies.

[0008] Since it is the variable regions of light and heavy chains that interact with an antigen, single chain antibody
fragments (scFvs) have been created with one V L and one V H , containing all six CDRs, joined by a peptide linker
(U.S. Patent 4,946,778) to create a V L -L-V H polypeptide, wherein the L stands for the peptide linker. A scFv wherein the
V L and V H domains are orientated V H -L-V L is disclosed in U.S. Patent 5,132,405.

[0009] As the scFvs have one binding site as compared to the minimum of two for complete antibodies, the scFvs
have reduced avidity as compared to the antibody containing two or more binding sites.

[0010] It would therefore be advantageous to obtain constructions of scFvs having more than one binding site to 
enhance the avidity of the polypeptide, and retain or increase their antigen recognition properties. In addition, it would 
be beneficial to obtain multivalent scFvs which are bispecific to allow for recognition of different epitopes on the target 
tissue, to allow for antibody-based recruitment of other immune effector functions, or to allow antibody capture of a
therapeutic or diagnostic moiety.

[0011] It has been found that single chain antibody fragments, each having one V H and one V L domain covalently 
linked by a first peptide linker, can be covalently linked by a second peptide linker to form a multivalent single chain 
antibody which maintains the binding affinity of a whole antibody. In one embodiment, the present invention is a
multivalent single chain antibody having affinity for an antigen wherein the multivalent single chain antibody comprises
two or more light chain variable domains and two or more heavy chain variable domains, wherein each variable domain
is linked to at least one other variable domain.

[0012] In another embodiment, the present invention is a multivalent single chain antibody which comprises two or 
more single chain antibody fragments, each single chain antibody fragment specifically binding an antigen, wherein 
the single chain antibody fragments are covalently linked by a first peptide linker which contains an amino acid sequence 
of

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu

and each single chain antibody fragment comprises

(a) a first polypeptide comprising a light chain variable domain;

(b) a second polypeptide comprising a heavy chain variable domain; and

(c) a second peptide linker linking the first and second polypeptides into a functional binding moiety.

[0013] In another embodiment, the invention provides a DNA sequence which codes for a multivalent single chain 
antibody, the multivalent single chain antibody comprising two or more single chain antibody fragments, each fragment 
having affinity for an antigen, wherein the fragments are covalently linked by a first peptide linker which contains an
amino acid sequence of

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu

and each fragment comprising

(a) a first polypeptide comprising a light chain variable domain;

(b) a second polypeptide comprising a heavy chain variable domain; and

(c) a second peptide linker linking the first and second polypeptides into a functional binding moiety.

[0014] The multivalent single chain antibodies allow for the construction of an antibody fragment which has the
specificity and avidity of a whole antibody but is smaller in size, allowing for more rapid capillary permeability. Multivalent
single chain antibodies also allow for the construction of a multivalent single chain antibody wherein the binding sites
can be directed against two different antigenic determinants.

BRIEF DESCRIPTION OF THE DRAWINGS

[0015] Figure 1 illustrates covalently linked single chain antibodies having the configuration V L -L-V H -L-V L -L-V H
(LHLH) and V L -L-V H -L-V H -L-V L (LHHL) and a noncovalently linked Fv single chain antibody (Fv2).

[0016] Figure 2 illustrates the nucleotide sequence of CC49 V L (SEQ ID NO: 1).

[0017] Figure 3 illustrates the amino acid sequence of CC49 V L (SEQ ID NO: 2).

[0018] Figure 4 illustrates the nucleotide sequence of CC49 V H (SEQ ID NO: 3).

[0019] Figure 5 illustrates the amino acid sequence of CC49 V H (SEQ ID NO: 4).

[0020] Figure 6 illustrates the nucleotide sequence and amino acid sequence of the CC49 single chain antibody
LHLH in p49LHLH (SEQ ID NO: 6).

[0021] Figure 7 illustrates the nucleotide sequence and amino acid sequence of the CC49 single chain antibody LHHL in
p49LHHL (SEQ ID NO: 8).

[0022] Figure 8 illustrates construction of plasmids pSL301 T and pSL301 HT.

[0023] Figure 9 illustrates construction of plasmid p49LHHL.

[0024] Figure 10 illustrates construction of plasmid p49LHLH.

[0025] Figure 11 illustrates the results of a competition assay using CC49 IgG, CC49 scFv2, and CC49 scFv, with
biotinylated CC49 IgG as competitor.

[0026] The entire teachings of all references cited herein are hereby incorporated by reference.

[0027] Nucleic acids, amino acids, peptides, protective groups, active groups and such, when abbreviated, are
abbreviated according to the IUPAC-IUB (Commission on Biological Nomenclature) or the practice in the fields concerned.

[0028] The term "single chain antibody fragment" (scFv) or "antibody fragment" as used herein means a polypeptide
containing a V L domain linked to a V H domain by a peptide linker (L), represented by V L -L-V H . The order of the V L and
V H domains can be reversed to obtain polypeptides represented as V H -L-V L . "Domain" is a segment of protein that
assumes a discrete function, such as antigen binding or antigen recognition.

[0029] A "multivalent single chain antibody" means two or more single chain antibody fragments covalently linked
by a peptide linker. The antibody fragments can be joined to form bivalent single chain antibodies having the order of
the V L and V H domains as follows:

V L -L-V H -L-V L -L-V H ; V L -L-V H -L-V H -L-V L ; V H -L-V L -L-V H -L-V L ; or V H -L-V L -L-V L -L-V H .

Single chain multivalent antibodies which are trivalent and greater have one or more antibody fragments joined to a
bivalent single chain antibody by an additional interpeptide linker. In a preferred embodiment, the number of V L and
V H domains is equivalent.
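
The domain-order notation of paragraph [0029] can be illustrated with a short sketch. It is only an illustration: the shorthand strings (such as LHLH, where L is a light chain variable domain and H a heavy chain variable domain) follow the usage of Figure 1, while the function names and the compact spelling "VL"/"VH" are assumptions made for the example. Note that in the expanded form the "-L-" between domains denotes the peptide linker.

```python
def expand_configuration(shorthand: str) -> str:
    """Expand shorthand such as 'LHLH' into 'VL-L-VH-L-VL-L-VH'.

    In the shorthand, 'L' is a light chain variable domain and 'H' a heavy
    chain variable domain; in the result, '-L-' is the peptide linker.
    """
    names = {"L": "VL", "H": "VH"}
    domains = [names[c] for c in shorthand]  # KeyError for anything but L/H
    return "-L-".join(domains)


def valency(shorthand: str) -> int:
    """Number of VL/VH pairs, assuming equal numbers of each domain."""
    n_light, n_heavy = shorthand.count("L"), shorthand.count("H")
    if n_light != n_heavy:
        raise ValueError("expected equal numbers of VL and VH domains")
    return n_light


# The four bivalent orders listed in paragraph [0029].
for order in ("LHLH", "LHHL", "HLHL", "HLLH"):
    print(order, expand_configuration(order), valency(order))
```

Each of the four orders expands to a chain of four variable domains joined by three linkers and has a valency of two.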

[0030] The present invention also provides for multivalent single chain antibodies which can be designated
V H -L-V H -L-V L -L-V L or V L -L-V L -L-V H -L-V H .

[0031] Covalently linked single chain antibodies having the configuration V L -L-V H -L-V L -L-V H (LHLH) and
V L -L-V H -L-V H -L-V L (LHHL) are illustrated in Figure 1. A noncovalently linked Fv single chain antibody (Fv2) is also
illustrated in Figure 1.

[0032] The single chain antibody fragments for use in the present invention can be derived from the light and/or 
heavy chain variable domains of any antibody. Preferably, the light and heavy chain variable domains are specific for 
the same antigen. The individual antibody fragments which are joined to form a multivalent single chain antibody may
be directed against the same antigen or can be directed against different antigens. 

[0033] To prepare a vector containing the DNA sequence for a single chain multivalent antibody, a source of the 
genes encoding for these regions is required. The appropriate DNA sequence can be obtained from published sources 
or can be obtained by standard procedures known in the art. For example, Kabat et al., Sequences of Proteins of
Immunological Interest, 4th ed. (1991), published by the U.S. Department of Health and Human Services, discloses
sequences of most of the antibody variable regions which have been described to date.

[0034] When the genetic sequence is unknown, it is generally possible to utilize cDNA sequences obtained from
mRNA by reverse transcriptase mediated synthesis as a source of DNA to clone into a vector. For antibodies, the
source of mRNA can be obtained from a wide range of hybridomas. See, for example, the catalogue ATCC Cell Lines
and Hybridomas, American Type Culture Collection, 20309 Parklawn Drive, Rockville Md., USA (1990). Hybridomas
secreting monoclonal antibodies reactive with a wide variety of antigens are listed therein, are available from the
collection, and are usable in the present invention. These cell lines and others of similar nature can be utilized as a source
of mRNA coding for the variable domains or to obtain antibody protein to determine the amino acid sequence of the
monoclonal antibody itself.

[0035] Variable regions of antibodies can also be derived by immunizing an appropriate vertebrate, normally a
domestic animal, and most conveniently a mouse. The immunogen will be the antigen of interest, or where a hapten, an
antigenic conjugate of the hapten to an antigen such as keyhole limpet hemocyanin (KLH). The immunization may be
carried out conventionally with one or more repeated injections of the immunogen into the host mammal, normally at
two to three week intervals. Usually, three days after the last challenge, the spleen is removed and dissociated into
single cells to be used for cell fusion to provide hybridomas from which mRNA can readily be obtained by standard
procedures known in the art.

[0036] When an antibody of interest is obtained, and only its amino acid sequence is known, it is possible to reverse 
translate the sequence. 
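
As a minimal sketch of the reverse translation mentioned in paragraph [0036], the following picks one codon per amino acid from the standard genetic code and checks that the resulting DNA translates back to the starting peptide. The particular codon chosen for each residue is an arbitrary assumption made for illustration (the description does not specify codon usage), and the example peptide is simply the first eight residues of the linker sequence recited in this description.

```python
# One arbitrarily chosen codon per amino acid from the standard genetic code.
# The choice of codon is an assumption for illustration only.
CODON_FOR = {
    "A": "GCT", "R": "CGT", "N": "AAT", "D": "GAT", "C": "TGT",
    "Q": "CAA", "E": "GAA", "G": "GGT", "H": "CAT", "I": "ATT",
    "L": "CTG", "K": "AAA", "M": "ATG", "F": "TTT", "P": "CCT",
    "S": "TCT", "T": "ACT", "W": "TGG", "Y": "TAT", "V": "GTT",
}
AMINO_ACID_FOR = {codon: aa for aa, codon in CODON_FOR.items()}


def reverse_translate(protein: str) -> str:
    """Return one possible DNA coding sequence for a one-letter peptide."""
    return "".join(CODON_FOR[aa] for aa in protein.upper())


def translate(dna: str) -> str:
    """Translate DNA built by reverse_translate back into a peptide."""
    return "".join(AMINO_ACID_FOR[dna[i:i + 3]] for i in range(0, len(dna), 3))


peptide = "LSADDAKK"  # Leu Ser Ala Asp Asp Ala Lys Lys
dna = reverse_translate(peptide)
print(dna, translate(dna) == peptide)
```

Because the genetic code is degenerate, many other DNA sequences would encode the same peptide; in practice codons are usually selected to suit the expression host or to introduce restriction sites.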

[0037] The V L and V H domains for use in the present invention are preferably obtained from one of a series of CC
antibodies against tumor-associated glycoprotein 72 antigen (TAG-72) disclosed in published PCT Application WO
90/04410 on May 3, 1990, and published PCT Application WO 89/00692 on January 26, 1989. More preferred are the
V L and V H domains from the monoclonal antibody designated CC49 in PCT Publications WO 90/04410 and WO
89/00692. The nucleotide sequence (SEQ ID NO: 1) which codes for the V L of CC49 is substantially the same as that
given in Figure 2. The amino acid sequence (SEQ ID NO: 2) of the V L of CC49 is substantially the same as that given
in Figure 3. The nucleotide sequence (SEQ ID NO: 3) which codes for the V H of CC49 is substantially the same as
that given in Figure 4. The amino acid sequence (SEQ ID NO: 4) for the V H of CC49 is substantially the same as that
given in Figure 5.

[0038] To form the antibody fragments and multivalent single chain antibodies of the present invention, it is necessary
to have a suitable peptide linker. Suitable linkers for joining the V H and V L domains are those which allow the V H and
V L domains to fold into a single polypeptide chain which will have a three dimensional structure very similar to the
original structure of a whole antibody and thus maintain the binding specificity of the whole antibody from which the
antibody fragment is derived. Suitable linkers for linking the scFvs are those which allow the linking of two or more scFvs such
that the V H and V L domains of each immunoglobulin fragment have a three dimensional structure such that each
fragment maintains the binding specificity of the whole antibody from which the immunoglobulin fragment is derived.
Linkers having the desired properties can be obtained by the method disclosed in U.S. Patent 4,946,778, the disclosure
of which is hereby incorporated by reference. From the polypeptide sequences generated by the methods described
in U.S. Patent 4,946,778, genetic sequences coding for the polypeptide can be obtained.

[0039] Preferably, the peptide linker joining the V H and V L domains to form a scFv and the peptide linker joining two
or more scFvs to form a multivalent single chain antibody have substantially the same amino acid sequence.

EP 0 628 078 B1

[0040] It is also necessary that the linker peptides be attached to the antibody fragments such that the binding of the 
linker to the individual antibody fragments does not interfere with the binding capacity of the antigen recognition site. 
[0041] A preferred linker is based on the helical linker designated 205C as disclosed in Pantoliano et al., Biochem.,
30, 10117-10125 (1991), but with the first and last amino acids changed because of the codons dictated by the Xho I site
at one end and the Hind III site at the other. The amino acid sequence (SEQ ID NO: 5) of the preferred linker is as follows:

Leu-Ser-Ala-Asp-Asp-Ala-Lys-Lys-Asp-Ala-Ala-Lys-Lys-Asp-Asp-Ala-Lys-Lys-Asp-Asp-Ala-Lys-Lys-Asp-Leu.


[0042] The linker is generally 10 to 50 amino acid residues. Preferably, the linker is 10 to 30 amino acid residues. 
More preferably the linker is 12 to 30 amino acid residues. Most preferred is a linker of 15 to 25 amino acid residues. 
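
The length ranges of paragraph [0042] can be checked against the linker sequence recited in this description with a short sketch. The three-letter sequence and the four ranges are taken from the text; the dictionary of one-letter codes covers only the five residue types that occur in this linker, and the function and variable names are illustrative assumptions.

```python
# One-letter codes for the residue types that occur in the linker.
THREE_TO_ONE = {"Leu": "L", "Ser": "S", "Ala": "A", "Asp": "D", "Lys": "K"}

# Linker sequence as recited in the description (SEQ ID NO: 5).
LINKER = ("Leu-Ser-Ala-Asp-Asp-Ala-Lys-Lys-Asp-Ala-Ala-Lys-Lys-Asp-Asp-"
          "Ala-Lys-Lys-Asp-Asp-Ala-Lys-Lys-Asp-Leu")

# Inclusive residue-count ranges from paragraph [0042].
RANGES = {
    "general": (10, 50),
    "preferred": (10, 30),
    "more preferred": (12, 30),
    "most preferred": (15, 25),
}


def one_letter(sequence: str) -> str:
    """Convert a hyphen-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[residue] for residue in sequence.split("-"))


def matching_ranges(length: int) -> list:
    """Names of the ranges that contain the given linker length."""
    return [name for name, (low, high) in RANGES.items() if low <= length <= high]


residues = one_letter(LINKER)
print(residues, len(residues), matching_ranges(len(residues)))
```

The recited linker has 25 residues, which places it at the upper edge of the most preferred range.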
[0043] Expression vehicles for production of the molecules of the invention include plasmids or other vectors. In
general, such vectors contain replicon and control sequences which are derived from species compatible with a host
cell. The vector ordinarily carries a replicon site, as well as specific genes which are capable of providing phenotypic
selection in transformed cells. For example, E. coli is readily transformed using pBR322 [Bolivar et al., Gene, 2, 95- 
(1977), or Sambrook et al, Molecular Cloning, Cold Spring Harbor Press, New York, 2nd Ed. (1989)]. 
[0044] Plasmids suitable for eukaryotic cells may also be used. S. cerevisiae, or common baker's yeast, is the most
commonly used among eukaryotic microorganisms, although a number of other strains, such as Pichia pastoris, are
available. Cultures of cells derived from multicellular organisms such as SP2/0 or Chinese Hamster Ovary (CHO),
which are available from the ATCC, may also be used as hosts. Typical of vector plasmids suitable for mammalian
cells are pSV2neo and pSV2gpt (ATCC); pSVL and pKSV-10 (Pharmacia); and pBPV-1/pML2d (International
Biotechnology, Inc.).

[0045] The use of prokaryotic and eukaryotic viral expression vectors to express the genes for polypeptides of the
present invention is also contemplated.

[0046] It is preferred that the expression vectors and the inserts which code for the single chain multivalent antibodies
have compatible restriction sites at the insertion junctions and that those restriction sites are unique to the areas of
insertion. Both vector and insert are treated with restriction endonucleases and then ligated by any of a variety of
methods such as those described in Sambrook et al., supra.

[0047] Preferred genetic constructions of vectors for production of single chain multivalent antibodies of the present
invention are those which contain a constitutively active transcriptional promoter and a region encoding a signal peptide
which will direct synthesis/secretion of the nascent single chain polypeptide out of the cell. Preferably, the expression
rate is commensurate with the transport, folding and assembly steps to avoid accumulation of the polypeptide as
insoluble material. In addition to the replicon and control sequences, additional elements may also be needed for
optimal synthesis of single chain polypeptide. These elements may include splice signals, as well as transcription
promoters, enhancers, and termination signals. Furthermore, additional genes and their products may be required to
facilitate assembly and folding (chaperones).

[0048] Vectors which are commercially available can easily be altered to meet the above criteria for a vector. Such
alterations are easily performed by those of ordinary skill in the art in light of the available literature and the teachings
herein.

[0049] Additionally, it is preferred that the cloning vector contain a selectable marker, such as a drug resistance 
marker or other marker which causes expression of a selectable trait by the host cell. "Host cell" refers to cells which 
can be recombinantly transformed with vectors constructed using recombinant DNA techniques. A drug resistance or
other selectable marker is intended in part to facilitate the selection of transformants. Additionally, the presence of
a selectable marker, such as a drug resistance marker, may be of use in keeping contaminating microorganisms from
multiplying in the culture medium. In this embodiment, such a pure culture of the transformed host cell would be obtained 
by culturing the cells under conditions which require the induced phenotype for survival. 

[0050] Recovery and purification of the single chain multivalent antibodies of the present invention can be accomplished
using standard techniques known in the art. For example, if they are secreted into the culture medium, the single chain
multivalent antibodies can be concentrated by ultrafiltration. When the polypeptides are transported to the periplasmic
space of a host cell, purification can be accomplished by osmotically shocking the cells, and proceeding with ultrafiltration,
antigen affinity column chromatography or column chromatography using ion exchange chromatography and gel filtration.
Polypeptides which are insoluble and present as refractile bodies, also called inclusion bodies, can be purified by lysis of
the cells, repeated centrifugation and washing to isolate the inclusion bodies, solubilization, such as with guanidine-HCl,
and refolding followed by purification of the biologically active molecules.

[0051] The activity of single chain multivalent antibodies can be measured by standard assays known in the art, for
example competition assays, enzyme-linked immunosorbent assay (ELISA), and radioimmunoassay (RIA).

[0052] The multivalent single chain antibodies of the present invention provide unique benefits for use in diagnostics
and therapeutics. The use of multivalent single chain antibodies affords a number of advantages over the use of larger
fragments or entire antibody molecules. They reach their target tissue more rapidly, and are cleared more quickly from
the body.

[0053] For diagnostic and/or therapeutic uses, the multivalent single chain antibodies can be constructed such that
one or more antibody fragments are directed against a target tissue and one or more antibody fragments are directed
against a diagnostic or therapeutic agent.

[0054] The invention also concerns pharmaceutical compositions which are particularly advantageous for use in the 
diagnosis and/or therapy of diseases, such as cancer, where target antigens are often expressed on the surface of 
cells. For diagnostic and/or therapeutic uses, the multivalent single chain antibodies can be conjugated with an
appropriate imaging or therapeutic agent by methods known in the art. The pharmaceutical compositions of the invention
are prepared by methods known in the art, e.g., by conventional mixing, dissolving or lyophilizing processes. 
[0055] The invention will be further clarified by a consideration of the following examples, which are intended to be 
purely exemplary of the present invention. 

ABBREVIATIONS

[0056]

BCIP               5-bromo-4-chloro-3-indolyl phosphate
bp                 base pair
Bis-Tris propane   (1,3-bis[tris(hydroxymethyl)-methylamino]propane)
BSA                bovine serum albumin
CDR                complementarity determining region
ELISA              enzyme linked immunosorbent assay
Fv2                non-covalent single chain Fv dimer
IEF                isoelectric focusing
Kbp                kilo base pair
LB                 Luria-Bertani medium
Mab                monoclonal antibody
MES                2-(N-Morpholino)ethane sulfonic acid
MW                 molecular weight
NBT                nitro blue tetrazolium chloride
Oligo              oligonucleotides
PAG                polyacrylamide gel
PAGE               polyacrylamide gel electrophoresis
PBS                phosphate buffered saline
PCR                polymerase chain reaction
pSCFV              plasmid containing DNA sequence coding for SCFV
RIGS               radioimmunoguided surgery
RIT                radioimmunotherapy
scFv               single chain Fv immunoglobulin fragment monomer
scFv2              single chain Fv immunoglobulin fragment dimer covalently linked
SDS                sodium dodecyl sulfate
TBS                Tris-buffered saline
Tris               (Tris[hydroxymethyl]aminomethane)
TTBS               Tween-20 wash solution
V H                immunoglobulin heavy chain variable domain
V L                immunoglobulin light chain variable domain


Antibodies 





[0057] CC49: A murine monoclonal antibody specific to the human tumor-associated glycoprotein 72 (TAG-72)
deposited as ATCC No. HB9459.

[0058] CC49 Fab: An antigen binding portion of CC49 consisting of an intact light chain linked to the N-terminal
portion of the heavy chain.

[0059] CC49 scFv: Single chain antibody fragment consisting of two variable domains of CC49 antibody joined by
a peptide linker.

[0060] CC49 Fv2: Two CC49 scFv non-covalently linked to form a dimer. The number after Fv refers to the number
of monomer subunits of a given molecule, e.g., CC49 Fv6 refers to the hexamer multimers.

[0061] CC49 scFv2: Covalently-linked single chain antibody fragment consisting of two CC49 V L domains and two
V H domains joined by three linkers. Six possible combinations for the order of linking the V L (L) and the V H (H) domains
together are: LHLH, LHHL, LLHH, HLLH, HLHL, and HHLL.
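
The count of six orderings in paragraph [0061] follows from arranging two L and two H domains in a row, that is 4!/(2!·2!) = 6. A brief sketch, with an assumed helper name, enumerates them:

```python
from itertools import permutations


def domain_orders(n_pairs: int = 2) -> list:
    """All distinct linear orders of n_pairs light (L) and n_pairs heavy (H) domains."""
    domains = "L" * n_pairs + "H" * n_pairs
    # permutations() repeats arrangements of identical letters; a set removes them.
    return sorted({"".join(p) for p in permutations(domains)})


orders = domain_orders()
print(len(orders), orders)
```

For two domains of each type this yields the same six combinations listed above; the same helper gives twenty orders for a trivalent construct with three domains of each type.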

Plasmids

[0062] pSCFV UHM: Plasmid containing the coding sequence for an scFv consisting of a CC49 variable light chain and a
CC49 variable heavy chain joined by a 25 amino acid linker.

[0063] p49LHLH or p49LHHL: Plasmids containing the coding sequence for producing CC49 scFv2 LHLH or LHHL 
products, respectively. 

EXAMPLES 


General Experimental 

[0064] Procedures for molecular cloning are as those described in Sambrook et al., Molecular Cloning, Cold Spring 
Harbor Press, New York, 2nd Ed. (1989) and Ausubel et al., Current Protocols in Molecular Biology, John Wiley and 
Sons, New York (1992), the disclosures of which are hereby incorporated by reference.
[0065] All water used throughout was deionized distilled water. 

Oligonucleotide Synthesis and Purification 

[0066] All oligonucleotides (oligos) were synthesized on either a Model 380A or a Model 391 DNA Synthesizer from
Applied Biosystems (Foster City, CA) using standard β-cyanoethyl phosphoramidites and synthesis columns.
Protecting groups on the product were removed by heating in concentrated ammonium hydroxide at 55°C for 6 to 15 hours.
The ammonium hydroxide was removed through evaporation and the crude mixtures were resuspended in 30 to 40
µL of sterile water. After electrophoresis on polyacrylamide-urea gels, the oligos were visualized using short wavelength
ultraviolet (UV) light. DNA bands were excised from the gel and eluted into 1 mL of 100 mM Tris-HCl, pH 7.4, 500 mM
NaCl, 5 mM EDTA over 2 hours at 65°C. Final purification was achieved by applying the DNA to Sep-Pac™ C-18
columns (Millipore, Bedford, MA) and eluting the bound oligos with 60 percent methanol. The solution volume was
reduced to approximately 50 µL and the DNA concentration was determined by measuring the optical density at 260
nm (OD260).
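
The description does not state how the OD260 reading was converted to a concentration. As a rough sketch, a commonly used rule of thumb is that one absorbance unit at 260 nm (1 cm path length) corresponds to about 33 µg/mL of single-stranded oligonucleotide; that factor, the function names, and the example reading below are all assumptions for illustration rather than values from the source.

```python
# Assumed rule-of-thumb conversion for single-stranded DNA (1 cm path length).
SSDNA_UG_PER_ML_PER_A260 = 33.0


def oligo_concentration_ug_per_ml(a260: float, dilution_factor: float = 1.0,
                                  factor: float = SSDNA_UG_PER_ML_PER_A260) -> float:
    """Estimate the stock concentration from an A260 reading of a diluted sample."""
    return a260 * dilution_factor * factor


def total_yield_ug(concentration_ug_per_ml: float, volume_ul: float) -> float:
    """Total mass of oligo in the given volume (microliters)."""
    return concentration_ug_per_ml * volume_ul / 1000.0


# Hypothetical reading: A260 of 0.25 measured on a 1:100 dilution of the 50 µL stock.
conc = oligo_concentration_ug_per_ml(0.25, dilution_factor=100.0)
print(conc, total_yield_ug(conc, 50.0))
```

A sequence-specific extinction coefficient would give a more accurate figure than the generic factor used here.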


Restriction Enzyme Digests 

[0067] All restriction enzyme digests were performed using Bethesda Research Laboratories (Gaithersburg, MD),
New England Biolabs, Inc. (Beverly, MA) or Boehringer Mannheim (BM, Indianapolis, IN) enzymes and buffers following
the manufacturer's recommended procedures. Digested products were separated by polyacrylamide gel
electrophoresis (PAGE). The gels were stained with ethidium bromide, the DNA bands were visualized using long wavelength UV
light and the DNA bands were then excised. The gel slices were placed in dialysis tubing (Union Carbide Corp., Chicago)
containing 5 mM Tris, 2.5 mM acetic acid, 1 mM EDTA, pH 8.0 and eluted using a Max Submarine electrophoresis
apparatus (Hoefer Scientific Instruments, CA). Sample volumes were reduced on a Speed Vac Concentrator (Savant
Instruments, Inc., NY). The DNA was ethanol precipitated and redissolved in sterile water.

Enzyme Linked Immunosorbent Assay (ELISA) 

[0068] TAG-72 antigen, prepared substantially as described by Johnson et al., Can. Res., 46, 850-857 (1986), was
adsorbed onto the wells of a polyvinyl chloride 96 well microtiter plate (Dynatech Laboratories, Inc., Chantilly, VA) by
drying overnight. The plate was blocked with 1 percent BSA in PBS for 1 hour at 31°C and then washed 3 times with
200 µL of PBS, 0.05 percent Tween-20. 25 µL of test antibodies and 25 µL of biotinylated CC49 (1/20,000 dilution of
a 1 mg/mL solution) were added to the wells and the plate was incubated for 30 minutes at 31°C. The relative amounts of
TAG-72 bound to the plate, biotinylated CC49, streptavidin-alkaline phosphatase, and color development times were
determined empirically in order not to have an excess of either antigen or biotinylated CC49, yet have enough signal to
detect competition by scFv. Positive controls were CC49 at 5 µg/mL and CC49 Fab at 10 µg/mL. Negative controls
were 1 percent BSA in PBS and/or concentrated LB. Unbound proteins were washed away. 50 µL of a 1:1000 dilution
of streptavidin conjugated with alkaline phosphatase (Southern Biotechnology Associates, Inc., Birmingham, AL) were
added and the plate was incubated for 30 minutes at 31°C. The plate was washed 3 more times. 50 µL of a
para-nitrophenyl-phosphate solution (Kirkegaard & Perry Laboratories, Inc., Gaithersburg, MD) were added and the color
reaction was allowed to develop for a minimum of 20 minutes. The relative amount of scFv2 binding was measured
by optical density scanning at 404-450 nm using a microplate reader (Molecular Devices Corporation, Menlo Park,
CA). Binding of the scFv2 species resulted in decreased binding of the biotinylated CC49 with a concomitant decrease
in color development.
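
Because competition shows up as a loss of color, the readout of paragraph [0068] can be summarized as a percent competition relative to an uncompeted well. The sketch below is one conventional way to express this; the formula, the function name, and the optical density values are assumptions for illustration, since the description does not say how the readings were reduced.

```python
def percent_competition(od_sample: float, od_uncompeted: float,
                        od_background: float = 0.0) -> float:
    """Percent loss of biotinylated CC49 signal caused by a test antibody.

    od_uncompeted is the reading with no competitor (for example the
    1 percent BSA negative control); od_background is a blank well.
    """
    signal_max = od_uncompeted - od_background
    if signal_max <= 0:
        raise ValueError("uncompeted signal must exceed background")
    return 100.0 * (1.0 - (od_sample - od_background) / signal_max)


# Hypothetical readings, not data from the description.
print(round(percent_competition(0.43, 1.20, 0.10), 1))
```

A value near 0 indicates no competition and a value near 100 indicates that the test antibody displaced essentially all of the biotinylated CC49.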

SDS-PAGE and Western Blotting 

[0069] Samples for SDS-PAGE analysis (20 µL) were prepared by boiling in a non-reducing sample preparation
buffer, Seprasol I (Integrated Separation Systems (ISS), Natick, MA), for 5 minutes and loaded on 10-20 percent gradient
polyacrylamide Daiichi Minigels as per the manufacturer's directions (ISS).

[0070] Electrophoresis was conducted using a Mini 2-gel apparatus (ISS) at 55 mA per gel at constant current for
approximately 75 minutes. Gels were stained in Coomassie Brilliant Blue R-250 (Bio-Rad, Richmond, CA) for at least
1 hour and destained. Molecular weight standards were prestained (Mid Range Kit, Diversified Biotech, Newton Center,
MA) and included the following proteins: phosphorylase b, glutamate dehydrogenase, ovalbumin, lactate
dehydrogenase, carbonic anhydrase, β-lactoglobulin and cytochrome C. The corresponding MWs are: 95,500, 55,000, 43,000,
36,000, 29,000, 18,400, and 12,400, respectively.
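
Molecular weight standards such as those listed above are typically used to estimate the size of an unknown band by fitting the logarithm of molecular weight against relative mobility. The sketch below assumes that approach; the molecular weights are those given in the text, whereas the relative mobility (Rf) values are invented placeholders, and the straight-line fit is only an approximation on a gradient gel.

```python
import math

# Molecular weights of the standards as listed in the text.
STANDARDS_MW = [95500, 55000, 43000, 36000, 29000, 18400, 12400]
# Hypothetical relative mobilities for those standards (not from the source).
HYPOTHETICAL_RF = [0.10, 0.25, 0.33, 0.40, 0.48, 0.68, 0.85]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def estimate_mw(rf, slope, intercept):
    """Estimate molecular weight from relative mobility using the fitted line."""
    return 10 ** (slope * rf + intercept)


slope, intercept = fit_line(HYPOTHETICAL_RF, [math.log10(mw) for mw in STANDARDS_MW])
print(round(estimate_mw(0.35, slope, intercept)))
```

With real mobilities measured from the gel, the same fit would be used to read off the apparent size of scFv and scFv2 bands.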

[0071] When Western analyses were conducted, a duplicate gel was also run. After electrophoresis, one of the gels
was equilibrated for 15-20 minutes in anode buffer #1 (0.3 M Tris-HCl, pH 10.4). An Immobilon-P PVDF (polyvinylidene
difluoride) membrane (Millipore, Bedford, MA) was treated with methanol for 2 seconds, and immersed in water for 2
minutes. The membrane was then equilibrated in anode buffer #1 for 3 minutes. A Milliblot-SDE apparatus (Millipore)
was utilized to transfer proteins in the gel to the membrane. A drop of anode buffer #1 was placed in the middle of the
anode electrode surface. A sheet of Whatman 3MM filter paper was soaked in anode buffer #1 and smoothly placed
on the electrode surface. Another filter paper soaked in anode buffer #2 (25 mM Tris, pH 10.4) was placed on top of the
first one. A sandwich was made by next adding the wetted PVDF membrane, placing the equilibrated gel on top of this
and finally adding a sheet of filter paper soaked in cathode buffer (25 mM Tris-HCl, pH 9.4 in 40 mM glycine). Transfer
was accomplished in 30 minutes using 250 mA constant current (initial voltage ranged from 8-20 volts).

[0072] After blotting, the membrane was rinsed briefly in water and placed in a dish with 20 mL blocking solution (1
percent bovine serum albumin (BSA) (Sigma, St. Louis, MO) in Tris-buffered saline (TBS)). TBS was purchased from
Pierce Chemical (Rockford, IL) as a preweighed powder such that when 500 mL water is added, the mixture gives a
25 mM Tris, 0.15 M sodium chloride solution at pH 7.6. The membranes were blocked for a minimum of 1 hour at
ambient temperature and then washed 3 times for 5 minutes each using 20 mL 0.5 percent Tween-20 wash solution
(TTBS). To prepare the TTBS, 0.5 mL of Tween 20 (Sigma) was mixed per liter of TBS. The probe antibody used was
20 mL biotinylated FAID14 solution (10 µg per 20 mL antibody buffer). Antibody buffer was made by adding 1 g BSA
per 100 mL of TTBS. After probing for 30-60 minutes at ambient temperature, the membrane was washed 3 times with
TTBS, as above.

[0073] Next, the membrane was incubated for 30-60 minutes at ambient temperature with 20 mL of a 1:500 dilution
in antibody buffer of streptavidin conjugated with alkaline phosphatase (Southern Biotechnology Associates, Birmingham,
AL). The wash step was again repeated after this, as above. Prior to the color reaction, membranes were washed
for 2 minutes in an alkaline carbonate buffer (20 mL). This buffer is 0.1 M sodium bicarbonate, 1 mM MgCl2·H2O, pH
9.8. To make up the substrate for alkaline phosphatase, nitroblue tetrazolium (NBT) chloride (50 mg, Sigma) was
dissolved in 70 percent dimethylformamide. 5-Bromo-4-chloro-3-indolyl phosphate (BCIP) (25 mg, Sigma) was
separately dissolved in 100 percent dimethylformamide. These solutions are also commercially available as a Western
developing agent sold by Promega. For color development, 120 µL of each were added to the alkaline solution above
and allowed to react for 15 minutes before they were washed from the developed membranes with water.

Biotinylated FAID14 


[0074] FAID14 is a murine anti-idiotypic antibody (IgG2a, κ isotype), deposited as ATCC No. CRL 10256, directed
against CC49. FAID14 was purified using a Nygene Protein A affinity column (Yonkers, NY). The manufacturer's
protocol was followed, except that 0.1 M sodium citrate, pH 3.0 was used as the elution buffer. Fractions were neutralized
to pH ~7 using 1.0 M Tris-HCl pH 9.0. The biotinylation reaction was set up as follows. FAID14 (1 mg, 100 µL in water)
was mixed with 100 µL of 0.1 M Na2CO3 pH 9.6. Biotinyl-ε-aminocaproic acid N-hydroxysuccinimide ester (Biotin-X-NHS)
(Calbiochem, La Jolla, CA) (2.5 mg) was dissolved in 0.5 mL dimethylsulfoxide. Biotin-X-NHS solution (20 µL)
was added to the FAID14 solution and allowed to react at 22°C for 4 hours. Excess biotin and impurities were removed
by gel filtration, using a Pharmacia Superose 12 HR10/30 column (Piscataway, NJ). At a flow rate of 0.8 mL/min, the
biotinylated FAID14 emerged with a peak at 16.8 min. The fractions making up this peak were pooled, stored at
4°C and used to detect the CC49 idiotype as determined by the CC49 V L and V H CDRs.
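
The reagent quantities in the biotinylation reaction can be checked with a short calculation. The sketch below is illustrative only: the molecular weights (about 150,000 g/mol for a murine IgG and about 454.5 g/mol for Biotin-X-NHS) are assumptions that are not stated in the text, and the function name is hypothetical.

```python
# Assumed molecular weights; neither value is given in the text.
IGG_MW_G_PER_MOL = 150_000.0        # typical murine IgG (assumption)
BIOTIN_X_NHS_MW_G_PER_MOL = 454.5   # Biotin-X-NHS (assumption)


def biotin_to_antibody_ratio(antibody_mg, biotin_stock_mg, biotin_stock_ml, biotin_added_ul):
    """Return (mg of biotin reagent added, molar ratio of reagent to antibody)."""
    stock_mg_per_ml = biotin_stock_mg / biotin_stock_ml
    biotin_mg = stock_mg_per_ml * biotin_added_ul / 1000.0
    biotin_mol = biotin_mg / 1000.0 / BIOTIN_X_NHS_MW_G_PER_MOL
    antibody_mol = antibody_mg / 1000.0 / IGG_MW_G_PER_MOL
    return biotin_mg, biotin_mol / antibody_mol


# Quantities from the text: 1 mg FAID14; 2.5 mg reagent in 0.5 mL DMSO; 20 uL added.
biotin_mg, ratio = biotin_to_antibody_ratio(1.0, 2.5, 0.5, 20.0)
print(f"Biotin-X-NHS added: {biotin_mg:.2f} mg; molar excess about {ratio:.0f}-fold")
```

Under these assumed molecular weights, the 20 µL aliquot delivers 0.1 mg of reagent, roughly a 30-fold molar excess over the antibody.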

Isoelectric Focusing (IEF) 

[0075] Isoelectric points (pI's) were predicted using a computer program called PROTEIN-TITRATE, available
through DNASTAR (Madison, WI). Based on the amino acid composition of an input sequence, the program gives a MW
value in addition to the pI. Because Cys residues would otherwise contribute to the calculated charge, the Cys count
was set to 0, since all Cys residues are involved in disulfide bonds.

[0076] Experimentally, pI's were determined using Isogel agarose IEF plates, pH range 3-10 (FMC Bioproducts,
Rockland, ME). A Biorad Bio-phoresis horizontal electrophoresis cell was used to run the IEF, following the directions
of both manufacturers. The electrophoresis conditions were: 500 volts (limiting), at 20 mA current and 10 W of constant
power. Focusing was complete in 90 min. IEF standards were purchased from Biorad; the kit included phycocyanin,
β-lactoglobulin B, bovine carbonic anhydrase, human carbonic anhydrase, equine myoglobin, human hemoglobins A
and C, 3 lentil lectins and cytochrome C, with pI values of 4.65, 5.10, 6.00, 6.50, 7.00, 7.10 and 7.50, 7.80, 8.00 and
8.20, and 9.60, respectively. Gels were stained and destained according to the directions provided by FMC.

Quantitation of CC49 Antibody Species 

[0077] All purified CC49 antibodies including the IgG, scFv2 species and the monomeric scFv were quantitated by
measuring the absorbance of protein dilutions at 280 nm using matching 1.0 cm pathlength quartz cuvettes (Hellma)
and a Perkin-Elmer UV/VIS Spectrophotometer, Model 552 A. Molar absorptivities (E m) were determined for each
antibody by using the following formula:

E m = (number Trp) × 5,500 + (number Tyr) × 1,340 + (number (Cys)2) × 150 + (number Phe) × 10

The values are based on information given by D. B. Wetlaufer, Advances in Protein Chemistry, 17, 375-378.
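
The formula above translates directly into code. The following is a minimal sketch, assuming only the four coefficients quoted in the text; the function names, the residue counts and the molecular weight in the example are hypothetical and are not taken from any CC49 species.

```python
def molar_absorptivity(n_trp, n_tyr, n_cystine, n_phe):
    """E m at 280 nm from residue counts, using the coefficients quoted in the text."""
    return n_trp * 5500 + n_tyr * 1340 + n_cystine * 150 + n_phe * 10


def e_0_1_percent(e_m, mw_daltons):
    """Absorbance of a 1 mg/mL (0.1 percent) solution in a 1 cm cuvette."""
    return e_m / mw_daltons


def concentration_mg_per_ml(a280, dilution_factor, e01, path_cm=1.0):
    """Concentration of the undiluted sample from the Beer-Lambert relation."""
    return a280 * dilution_factor / (e01 * path_cm)


# Hypothetical residue counts and molecular weight, for illustration only.
e_m = molar_absorptivity(n_trp=2, n_tyr=3, n_cystine=1, n_phe=4)
print(e_m)  # 15210
print(e_0_1_percent(e_m, mw_daltons=12_000))
```

Dividing E m by the molecular weight gives the absorbance of a 1 mg/mL solution, which is the form in which extinction values are usually applied to a diluted sample reading.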
High Performance Liquid Chromatography 

[0078] All high performance liquid chromatography (HPLC) for CC49 scFv2 purification was performed using an LKB
HPLC system with titanium or Teflon tubing throughout. The system consisted of the Model 2150 HPLC pump, Model
2152 controller, Uvicord SII Model 2238 detection system set to monitor absorbance at 276 nm, and the Model 2211
SuperRac fraction collector.

PCR Generation of Subunits 

[0079] All polymerase chain reactions (PCR) were performed with a reaction mixture consisting of: 150 picograms
(pg) plasmid target (pSCFVUHM); 100 pmoles primers; 1 µL Perkin-Elmer-Cetus (PEC, Norwalk, CT) Ampli-Taq
polymerase; 16 µL of 10 mM dNTPs and 10 µL of 10X buffer, both supplied in the PEC kit; and sufficient water to bring
the total volume to 100 µL. The PCR reactions were carried out essentially as described by the manufacturer.
Reactions were done in a PEC 9600 thermocycler with 30 cycles of: denaturation of the DNA at 94°C for 20 to 45 sec,
annealing at between 52 and 60°C for 0.5 to 1.5 min, and elongation at 72°C for 0.5 to 2.0 min. Oligonucleotide
primers were synthesized on an Applied Biosystems (Foster City, CA) 380 A or 391 DNA synthesizer and purified as
above.
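
The cycling ranges above imply a range of total hold times, and the primer amount implies a working concentration. The sketch below works both out; it assumes that the 100 pmoles refers to each primer and it ignores ramp times between temperatures, neither of which the text specifies. All names are hypothetical.

```python
CYCLES = 30

# (shortest, longest) hold time in seconds for each step, as given in the text.
STEPS_SECONDS = {
    "denaturation, 94 C": (20, 45),
    "annealing, 52-60 C": (30, 90),
    "elongation, 72 C": (30, 120),
}


def hold_time_range_minutes(cycles, steps):
    """Total programmed hold time (minutes) for the shortest and longest settings."""
    shortest = sum(lo for lo, _ in steps.values()) * cycles / 60
    longest = sum(hi for _, hi in steps.values()) * cycles / 60
    return shortest, longest


def primer_micromolar(pmol, volume_ul):
    """pmol per microliter is numerically equal to micromolar."""
    return pmol / volume_ul


print(hold_time_range_minutes(CYCLES, STEPS_SECONDS))  # (40.0, 127.5)
print(primer_micromolar(100, 100))                     # 1.0
```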

Ligations

[0080] Ligation reactions using 100 ng of vector DNA and a corresponding 1:1 stoichiometric equivalent of insert
DNA were performed using a Stratagene (La Jolla, CA) T4 DNA ligase kit following the manufacturer's directions.
Ligation reactions (20 µL total volume) were initially incubated at 18°C and allowed to cool gradually overnight to 4°C.
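
A 1:1 stoichiometric equivalent means equal moles of vector and insert, so the insert mass scales with the ratio of fragment lengths. The sketch below shows the arithmetic; the 4,000 bp vector length is a hypothetical figure chosen for illustration (the text does not give the vector size), while the 420 bp insert is one of the fragment sizes reported in the Examples.

```python
def insert_ng_for_molar_ratio(vector_ng, vector_bp, insert_bp, insert_to_vector=1.0):
    """Mass of insert (ng) that gives the requested insert:vector molar ratio."""
    return vector_ng * insert_bp / vector_bp * insert_to_vector


# 100 ng of a hypothetical 4,000 bp vector with a 420 bp insert at 1:1.
print(insert_ng_for_molar_ratio(vector_ng=100, vector_bp=4000, insert_bp=420))  # 10.5
```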


Transformations 

[0081] Transformations were performed utilizing 100 µL of Stratagene E. coli AG1 competent cells (Stratagene, La
Jolla, CA) according to the directions provided by the manufacturer. DNA from the ligation reactions (1-5 µL) was
used. After the transformation step, cells were allowed to recover for 1 hr in Luria broth (LB) at 37°C with continuous
mixing and subsequently plated onto either 20 µg/mL chloramphenicol containing (CAM 20) Luria agar for pSCFVUHM,
p49LHLH or p49LHHL, or 100 µg/mL ampicillin (AMP 100) Luria agar plates (LB-AMP 100) for clones containing the
plasmid pSL301 or subsequent constructions derived from pSL301.

Screening of E. coli Clones 

[0082] Bacterial plasmids were isolated from LB broth culture containing the appropriate drug to maintain selection
pressure using Promega (Madison, WI) Magic mini-prep plasmid preparation kits. The kit was used per the
manufacturer's specifications.

Plasmid Constructions 

[0083] Two plasmids, designated p49LHLH and p49LHHL, were constructed to produce multivalent single chain
antibodies. The host cell containing p49LHLH produced a polypeptide which can be designated by V L -L-V H -L-V L -L-V H,
where V L and V H are the light and heavy chain variable regions of the CC49 antibody and the linker (L) is a 25 amino
acid linker having the sequence (SEQ ID NO: 5):

Leu-Ser-Ala-Asp-Asp-Ala-Lys-Lys-Asp-Ala-Ala-Lys-Lys-Asp-Asp-Ala-Lys-Lys-Asp-Asp-Ala-Lys-Lys-Asp-Leu.

[0084] The host cell containing p49LHHL produced a polypeptide which can be designated by V L -L-V H -L-V H -L-V L,
where V L and V H are the light and heavy chain variable domains of the CC49 antibody and L is a peptide linker having
the amino acid sequence indicated above.

[0085] The nucleotide sequence (SEQ ID NO: 6) and amino acid sequence (SEQ ID NO: 7) of the CC49 V L -L-V H -L-V L -L-V H
(p49LHLH) are given in Figure 6. The nucleotide sequence (SEQ ID NO: 8) and amino acid sequence (SEQ
ID NO: 9) of the CC49 V L -L-V H -L-V H -L-V L (p49LHHL) are given in Figure 7.

Construction of pSL301 HT 

[0086] The construction of pSL301 HT is illustrated in Figure 8. The Bacillus licheniformis penicillinase P (penP)
terminator sequence was removed from the plasmid designated pSCFV UHM by a 45 minute digest with Nhe I and BamH
I, excised from a 4.5 percent polyacrylamide gel after electrophoresis, electroeluted, ethanol precipitated and ligated
into the same sites in the similarly prepared vector pSL301 (Invitrogen, San Diego, CA). A procedure for preparing
pSCFV UHM is given in U.S. patent application Ser. No. 07/935,695 filed August 21, 1992, the disclosure of which is
hereby incorporated by reference. In general, pSCFV UHM contains a nucleotide sequence for a penP promoter; a
unique Nco I restriction site; CC49 V L region; Hind III restriction site; a 25 amino acid linker; a unique Xho I restriction
site; CC49 V H region; Nhe I restriction site; penP terminator; and BamH I restriction site (see Figure 8). The penP
promoter and terminator are described in Mezes, et al., J. Biol. Chem., 258, 11211-11218 (1983).

[0087] An aliquot of the ligation reaction (3 µL) was used to transform competent E. coli AG1 cells, which were plated
on LB-AMP100 agar plates and grown overnight. Potential clones containing the penP terminator insert were screened
using a Pharmacia (Gaithersburg, MD) T7 Quickprime 32P DNA labeling kit in conjunction with the microwave colony
lysis procedure outlined in Buluwela et al., Nucleic Acids Research, 17, 452 (1989). The probe, which was the penP
Nhe I-BamH I terminator fragment itself, was prepared and used according to the directions supplied with the Quickprime
kit. A clone which was probe positive and which contained the 207 base pair insert from a BamH I and Nhe I digest
(base pairs (bp) 1958 to 2165, Figure 6) was designated pSL301 T and chosen to construct pSL301 HT, which would
contain the nucleotide sequence for CC49 V H. The Nhe I-BamH I penP terminator was placed into pSL301 in order
to eliminate the Eco47 III restriction endonuclease site present in the polylinker region between its Nhe I and
BamH I sites. This was designed to accommodate the subsequent build-up of the V L and V H domains, where the Eco47
III site needed to be unique for the placement of each successive V domain into the construction. As each V domain
was added at the Eco47 III-Nhe I sites, the Eco47 III site was destroyed in each case, so that the next Eco47 III site
coming in on the insert was unique.
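
The reason the Eco47 III site is lost at each step can be illustrated by joining the two blunt ends involved. The sketch below assumes the commonly documented cut positions for the two enzymes (Eco47 III cutting AGC^GCT and Fsp I cutting TGC^GCA, both leaving blunt ends); these cut positions are not stated in the text, and the function name is hypothetical.

```python
# (recognition site, cut offset within the site); both enzymes leave blunt ends.
# The cut positions are assumptions based on standard enzyme documentation.
ECO47_III = ("AGCGCT", 3)  # AGC^GCT
FSP_I = ("TGCGCA", 3)      # TGC^GCA


def blunt_junction(upstream_enzyme, downstream_enzyme):
    """Sequence formed when the upstream half-site is ligated to the downstream half-site."""
    up_site, up_cut = upstream_enzyme
    down_site, down_cut = downstream_enzyme
    return up_site[:up_cut] + down_site[down_cut:]


# Vector cut with Eco47 III, insert cut with Fsp I at its 5' end.
junction = blunt_junction(ECO47_III, FSP_I)
print(junction)                  # AGCGCA
print(junction == ECO47_III[0])  # False: no longer an Eco47 III site
print(junction == FSP_I[0])      # False: not an Fsp I site either
```

Because the hybrid junction is cut by neither enzyme, the only Eco47 III site left in the construct is the new one carried near the 3' end of the incoming insert.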

[0088] The V H sequence was made by PCR with the 5' oligo SCP1 and the 3' oligo SCP5, using pSCFV UHM as the
target for PCR amplification. The DNA sequences of SCP1 (SEQ ID NO: 10) and SCP5 (SEQ ID NO: 11) are as follows:

SCP1: 5'-TAAA CTCGAG GTT CAG TTG GAG CAG-3'

SCP5: 5'-TAAA GCTAGC ACCA AGCGCT TAG TGA GGA GAC GGT GAC TGA GGT-3'

The underlined portions (CTCGAG in SCP1; GCTAGC and AGCGCT in SCP5) indicate the endonuclease restriction sites.

[0089] The amplified V H DNA was purified from a 4 percent PAG, electroeluted, ethanol precipitated and dissolved
in 20 µL water. The V H sequence was digested with Xho I and Nhe I restriction enzymes and used as the insert with
the pSL301 T vector, which had been digested with the same restriction enzymes and subsequently purified. A standard
ligation reaction was done and an aliquot (4 µL) used to transform competent E. coli AG1 cells. The transformed cells
were plated onto LB-AMP 100 agar plates. Candidate clones were picked from a Nhe I and Xho I digest screen that
revealed that the CC49 V H insert had been obtained.

[0090] DNA sequencing was performed to verify the sequence of the CC49 V H with a United States Biochemical (USB)
(Cleveland, Ohio) Sequenase kit and the sequencing primers pSL301SEQB (a 21 bp sequencing primer which annealed
in the pSL301 vector 57 bp upstream from the Xho I site) and CC49VHP. Sequencing revealed clones with the correct
CC49 V H sequence in pSL301 HT. This plasmid was used as the starting point in the construction of both pSL301-HHLT
and pSL301-HLHT. The sequencing oligos used are shown here.

[0091] The nucleotide sequences of pSL301SEQB (SEQ ID NO: 12) and CC49VHP (SEQ ID NO: 13) are as follows:

pSL301SEQB: 5'-TCG TCC GAT TAG GCA AGC TTA-3'
CC49VHP: 5'-GAT GAT TTT AAA TAC AAT GAG-3'

Example 1: p49LHHL Construction

[0092] pSL301 HT (5 µg), used as the starting material, was digested with Eco47 III and Nhe I and the larger vector
fragment was purified. A CC49 V H insert fragment was generated by PCR using SCP6B as the 5' oligo and SCP5 as
the 3' oligo. The nucleotide sequence (SEQ ID NO: 14) of SCP6B is as follows:

SCP6B: 5'-TAAA TGCGCA GAT GAC GCA AAG AAA GAC GCA GCT AAA AAA GAC GAT
GCC AAA AAG GAT GAC GCC AAG AAA GAT CTT GAG GTT CAG TTG CAG CAG
TCT G-3'

The oligo SCP6B also contains part of the coding region for the linker (bp 8-76 of SEQ ID NO: 14). The portion of the
oligo designed to anneal with the CC49 V H target in pSCFV UHM is from bp 77-90 in SEQ ID NO: 14.
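
The statement that bp 8-76 of SCP6B encode part of the linker can be checked by translating that stretch and comparing it with SEQ ID NO: 5. The sketch below builds the standard genetic code table and performs the comparison; the variable and function names are hypothetical, and only the first 76 nucleotides of the primer (tail, Fsp I site and linker codons) are used.

```python
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

THREE_TO_ONE = {"Leu": "L", "Ser": "S", "Ala": "A", "Asp": "D", "Lys": "K"}

# Linker of SEQ ID NO: 5 as given in the text.
LINKER = ("Leu-Ser-Ala-Asp-Asp-Ala-Lys-Lys-Asp-Ala-Ala-Lys-Lys-Asp-Asp-"
          "Ala-Lys-Lys-Asp-Asp-Ala-Lys-Lys-Asp-Leu")
linker_one_letter = "".join(THREE_TO_ONE[res] for res in LINKER.split("-"))

# Nucleotides 1-76 of SCP6B: 5' tail, Fsp I site, then linker codons.
SCP6B_1_TO_76 = (
    "TAAA" "TGCGCA"
    "GATGACGCAAAGAAAGACGCAGCTAAAAAAGACGAT"
    "GCCAAAAAGGATGACGCCAAGAAAGATCTT"
)


def translate(dna):
    """Translate a DNA string read in frame from its first base."""
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, len(dna) - 2, 3))


encoded = translate(SCP6B_1_TO_76[7:76])  # bp 8-76, converted to 0-based slicing
print(encoded)
print(encoded == linker_one_letter[2:])   # True: residues 3-25 of the linker
```

The translated stretch starts at the Ala in position 3 of the linker, so the leading Leu-Ser of the linker is not encoded by bp 8-76 of this primer.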

[0093] The underlined sequence (TGCGCA) corresponds to the Fsp I site. The resulting PCR insert was purified, digested
with Fsp I and Nhe I and used in a ligation reaction with the pSL301 HT Eco47 III-Nhe I vector (Figure 7). Competent E.
coli AG1 cells were used for the transformation of this ligation reaction (3 µL) and were plated on LB-AMP 100 agar
plates. Two clones having the correct size Xho I-Nhe I insert representative of the pSL301 HHT product were sequenced
with the oligo SQP1 and a single clone with the correct sequence (nucleotides 1124-1543 of Figure 7) was chosen for
further construction. The nucleotide sequence of SQP1 (SEQ ID NO: 15) is as follows:

SQP1: 5'-TG ACT TTA TGT AAG ATG ATG T-3'

[0094] The final linker-V L subunit (bp 1544-1963, Figure 7) was generated using the 5' oligo, SCP7b, and the 3' oligo,
SCP8a, with pSCFV UHM as the target for the PCR. The nucleotide sequence of SCP7b (SEQ ID NO: 16) is as follows:

SCP7b: 5'-TAAA TGC GCA GAT GAC GCA AAG AAA GAC GCA GCT AAA AAA GAC GAT
GCC AAA AAG GAT GAC GCC AAG AAA GAT CTT GAC ATT GTG ATG TCA CAG TCT
CC-3'

The underlined nucleotides (TGC GCA) correspond to an Fsp I site. The nucleotide sequence of SCP8a (SEQ ID NO: 17)
is as follows:

SCP8a: 5'-TAAA GCT AGC TTT TTA CTT AAG CAC CAG CTT GGT CCC-3'

[0095] The first set of underlined nucleotides (GCT AGC) corresponds to an Nhe I site, while the other (CTT AAG)
corresponds to an Afl II site. Nucleotides 8-76 of SCP7b code for the linker (nucleotides 1544-1612 of Figure 7), while
nucleotides 77-99, which anneal to the V L, correspond to nucleotides 1613-1635 of Figure 7. The primer SCP8a has a
short tail at its 5' end, an Nhe I restriction site, a stop codon, an Afl II restriction site and the last 21 bases of the V L.
After Fsp I and Nhe I digestion, the resulting 420 bp insert was purified and ligated into the Nhe I and Eco47 III sites
of the purified pSL301 HHT vector. Candidate clones were screened with Nhe I and Xho I, the correct size insert was
verified, and the insert was sequenced with 49LFR2(-) and SQP1 to confirm the newly inserted sequence in
pSL301 HHLT. The nucleotide sequence of 49LFR2(-) (SEQ ID NO: 18) is as follows:

49LFR2(-): 5'-CTG CTG GTA CCA GGC CAA G-3'
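
The layout described above for SCP8a (tail, Nhe I site, stop codon, Afl II site, last 21 bases of the V L) can be verified from the sequences given in this document. The sketch below locates the two restriction sites, reads the stop codon on the coding strand, and compares the annealing region with the 3' end of the V L in SEQ ID NO: 1. The helper names are hypothetical.

```python
SCP8A = "TAAAGCTAGCTTTTTACTTAAGCACCAGCTTGGTCCC"
VL_LAST_21 = "GGGACCAAGCTGGTGCTGAAG"  # final 21 nucleotides of SEQ ID NO: 1

SITES = {"Nhe I": "GCTAGC", "Afl II": "CTTAAG"}


def reverse_complement(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


# 1-based start position of each site within the primer.
positions = {name: SCP8A.find(site) + 1 for name, site in SITES.items()}
print(positions)  # {'Nhe I': 5, 'Afl II': 17}

# The primer is antisense, so the coding strand is its reverse complement.
coding = reverse_complement(SCP8A)
annealing = coding[:21]
print(coding[21:24])  # TAA, the stop codon that follows the last V L codon

# Positions (1-based) where the primer differs from the native V L 3' end.
mismatches = [i + 1 for i, (a, b) in enumerate(zip(annealing, VL_LAST_21)) if a != b]
print(mismatches)  # [18]
```

The single difference changes the penultimate codon from CTG to CTT. Both encode leucine, so the change creates the Afl II site without altering the V L amino acid sequence.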

[0096] The plasmid pSL301 HHLT was digested with Xho I and Nhe I, purified, and the resulting 1179 bp V H -linker-V H -linker-V L
segment ligated into pSCFV UHM, which had been cut with the same restriction enzymes and the larger
vector fragment purified, to form p49LHHL. The ligation reaction (4 µL aliquot) was used to transform competent E.
coli AG1 cells (Stratagene) and plated onto LB-CAM20 agar plates. A single clone which had a plasmid with the correct
restriction enzyme map was selected to contain p49LHHL. The p49LHHL contains a penP promoter and a nucleotide
sequence for the CC49 multivalent single chain antibody scFv2:
V L -L-V H -L-V H -L-V L or CC49 scFv2 (LHHL).
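
The 1179 bp size quoted for the V H -linker-V H -linker-V L segment agrees with the sum of the component lengths given in the sequence listing. The sketch below is a simple consistency check that takes the domain lengths from SEQ ID NO: 1 and SEQ ID NO: 3 and the 25 residue linker; it ignores any flanking bases within the restriction sites, and the names are hypothetical.

```python
COMPONENT_BP = {
    "VL": 339,          # SEQ ID NO: 1
    "VH": 345,          # SEQ ID NO: 3
    "linker": 25 * 3,   # 25 amino acids, SEQ ID NO: 5
}


def segment_bp(components):
    """Total length in base pairs of a series of named components."""
    return sum(COMPONENT_BP[name] for name in components)


print(segment_bp(["VH", "linker", "VH", "linker", "VL"]))  # 1179
```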

Example 2: p49LHLH Construction

[0097] The construction of p49LHLH is schematically represented in Figure 10. A linker-V L subunit was generated
with the 5' oligo SCP7b and the 3' oligo SCP9 (SEQ ID NO: 19).

SCP9: 5'-TAAA GCTAGC ACCA AGCGCT TAG TTT CAG CAC CAG CTT GGT CCC AG-3'

[0098] Nucleotides 8-76 of the SCP7b oligo code for the linker in Figure 6 (corresponding to nucleotides 1124-1192),
and nucleotides 77-99 annealed to the pSCFV UHM target for the PCR, corresponding to nucleotides 1193-1215 of
the V L in Figure 6.

[0099] SCP9 has an Nhe I site (first underlined nucleotides, GCTAGC) and an Eco47 III site (second underlined
nucleotides, AGCGCT), which are restriction sites needed for making pSL301 HLT ready to accept the next V domain.
Nucleotides 18-23 of SCP9 correspond to nucleotides 1532-1537 of Figure 6 (coding for the first 2 amino acids of the
linker), while nucleotides 24-46 correspond to nucleotides 1508-1531 of Figure 6, which was also the annealing region
for SCP9 in the PCR. The plasmid pSL301 HT was digested with Eco47 III and Nhe I and the larger vector fragment was
purified for ligation with the linker-CC49 V L DNA insert fragment from the PCR, which had been treated with Fsp I and
Nhe I and purified. The ligation mixture (3 µL) was used to transform E. coli AG1 competent cells and one colony having
the correct Xho I-Nhe I size fragment was sequenced using the oligo PENPTSEQ2. The nucleotide sequence of
PENPTSEQ2 (SEQ ID NO: 20) is as follows:

5'-TTG ATC ACC AAG TGA CTT TAT G-3'

[0100] The sequencing results indicated that there had been a PCR error and deletion in the resulting pSL301 HLT
clone. A five base deletion, corresponding to nucleotides 1533-1537 as seen in Figure 6, had been obtained, and
nucleotide 1531, which should have been a T, was actually a G, as determined from the DNA sequence data. The resulting
sequence was

5'...G A AGC GCT T... etc.

where the underlined sequence (AGC GCT) fortuitously formed an Eco47 III site. The AGCGCT sequence, in Figure 6,
would correspond to nucleotides 1530, 1531, 1532, 1538, 1539 and 1540. This error was corrected in the next step,
generating pSL301 HLHT, by incorporating the five deleted bases at the end of oligo SCP6C (SEQ ID NO: 21).

SCP6C: 5'-TA AGCGCT GATGATGCTAAGAAGGACGCCGCAAAAAA
GGACGACGCAAAAAAAGATGATGCAAAAAAGGATCTGG
AGGTTCAGTTGCAGCAGTCTGAC-3'

[0101] The underlined sequence in SCP6C (AGCGCT) corresponds to an Eco47 III site. SCP6C was used as the 5' oligo,
with SCP10 as the 3' oligo, in a PCR to generate a linker-CC49 V H segment. The nucleotide sequence of SCP10
(SEQ ID NO: 22) is as follows:

SCP10: 5'-TTGT GCTAGC TTT TTA TGA GGA GAC GGT GAC TGA GGT T-3'

[0102] The underlined sequence in SCP10 (GCTAGC) corresponds to the Nhe I site found at nucleotides 1958-1963 in
Figure 6. The PCR insert was digested this time only with Nhe I and purified. The vector (pSL301 HLT) was digested at
the Eco47 III site (that had been formed) and Nhe I and purified. The insert and vector were ligated and an aliquot (3 µL)
used to transform competent E. coli AG1 cells. This was plated on LB-AMP 100 plates and candidate clones screened
with Xho I and Nhe I. Three clones having the correct size DNA were obtained. Two of these clones were sequenced
using the oligos 49VLCDR3(+) and SQP1. The nucleotide sequence (SEQ ID NO: 23) of 49VLCDR3(+) is as follows:

49VLCDR3(+): 5'-CAG CAG TAT TAT AGC TAT-3'

[0103] One clone with the correct sequence was obtained, and the sequence from nucleotides 1533 to 1963 in Figure
6 was verified, giving a correct pSL301 HLHT clone.

[0104] To generate the final plasmid, p49LHLH, for expression in E. coli, pSL301 HLHT (5 µg) was digested with Nhe
I and Xho I, and the smaller insert fragment containing the V H -L-V L -L-V H sequence purified. It was ligated with the
larger purified vector fragment from a digest of pSCFV UHM (5 µg) with Xho I and Nhe I. An aliquot of the ligation mix
(4 µL) was used to transform competent E. coli AG1 cells. The transformation mix was plated on LB-CAM20 plates,
and a representative clone for p49LHLH was selected on the basis of a correct restriction enzyme map (see Figure
10) and biological activity toward TAG-72.

Example 3 : Purification of CC49 scFv2 LHLH and LHHL Covalently Linked Dimers 

[0105] For the purification of the CC49 covalently linked single chain dimers (scFv2), E. coli periplasmic fractions
were prepared from 1.0 L overnight cultures of both p49LHLH and p49LHHL. Briefly, the culture was divided into 4 X
250 mL portions and centrifuged at 5,000 rpm for 10 minutes in a Sorvall GS-3 rotor. The pelleted cells were washed
and resuspended in 100 mL each of 10 mM Tris-HCl pH 7.3 containing 30 mM NaCl. The cells were again pelleted
and washed with a total of 100 mL 30 mM Tris-HCl pH 7.3 and pooled into one tube. To this, 100 mL of 30 mM Tris-HCl
pH 7.3 containing 40 percent w/v sucrose and 2.0 mL of 10 mM EDTA pH 7.5 was added. The mixture was kept
at room temperature, with occasional shaking, for 10 minutes. The hypertonic cells were then pelleted as before. In
the next step, the shock, the pellet was quickly suspended in 20 mL ice cold 0.5 mM MgCl2 and kept on ice for 10
minutes, with occasional shaking. The cells were pelleted as before and the supernatant containing the E. coli
periplasmic fraction was clarified further by filtration through a 0.2 µm Nalge (Rochester, NY) filter apparatus and
concentrated in Amicon (Danvers, MA) Centriprep 30 and Centricon 30 devices to a volume of less than 1.0 mL.

[0106] The concentrated periplasmic shockates from either the p49LHLH or p49LHHL clones were injected onto a
Pharmacia (Piscataway, NJ) Superdex 75 HR 10/30 HPLC column that had been equilibrated with PBS. At a flow rate
of 0.5 mL/minute, the product of interest, as determined by competition ELISA, emerged between 21 and 24
minutes. The active fractions were pooled, concentrated as before and dialyzed overnight using a System 500
Microdialyzer Unit (Pierce Chemical) against 20 mM Tris-HCl pH 7.6 with 3-4 changes of buffer and using an 8,000 MW
cutoff membrane. The sample was injected on a Pharmacia Mono Q HR 5/5 anion exchange HPLC column. A gradient
program using 20 mM Tris-HCl pH 7.6 as buffer A and the same solution plus 0.5 M NaCl as buffer B was employed
at a flow rate of 1.5 mL/min. The products of interest in each case, as determined by competition ELISA, emerged from
the column between 3 and 4 minutes. Analysis of the fractions at this point on duplicate SDS-PAGE gels, one stained
with Coomassie Brilliant Blue R-250 and the other transferred for Western analysis (using biotinylated FAID14 as the
probe antibody), revealed a single band at the calculated molecular weight for the scFv2 (LHLH or LHHL) species at
58,239 daltons. The active fractions were in each case concentrated, dialyzed against 50 mM MES pH 5.8 overnight
and injected on a Pharmacia Mono S HR 5/5 cation exchange column. The two fractions of interest from this purification
step, as determined by SDS-PAGE and ELISA, fractions 5 and 6, eluted just before the start of the gradient, so they
had not actually bound to the column. Fractions 5 and 6 were consequently pooled for further purification.

[0107] A Mono Q column was again run on the active Mono S fractions, but the buffer used was 20 mM Tris-HCl, pH
8.0 and the flow rate was decreased to 0.8 mL/minute. The products emerged without binding, but the impurity left
over from the Mono S was slightly more held up, so that separation did occur between 5 and 6 minutes. After this run,
the products were homogeneous and were saved for further characterization.
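
The retention times quoted for the chromatography steps can be converted to elution volumes by multiplying by the flow rate. The sketch below does this for the Superdex 75 and first Mono Q runs; it ignores system dead volume and injection delay, which the text does not report, and the function name is hypothetical.

```python
def elution_volume_ml(flow_ml_per_min, time_min):
    """Volume delivered by the pump up to the given retention time."""
    return flow_ml_per_min * time_min


# Superdex 75 HR 10/30 at 0.5 mL/min, product between 21 and 24 minutes.
print(elution_volume_ml(0.5, 21), elution_volume_ml(0.5, 24))  # 10.5 12.0

# Mono Q HR 5/5 at 1.5 mL/min, product between 3 and 4 minutes.
print(elution_volume_ml(1.5, 3), elution_volume_ml(1.5, 4))    # 4.5 6.0
```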

Isoelectric Focusing 

[0108] The isoelectric points (pI) of the constructs were predicted using the DNASTAR (Madison, WI) computer
program Protein-titrate. Based on amino acid composition, a MW and pI value were calculated.

[0109] Experimentally, pIs were determined using FMC Bioproducts (Rockland, ME) Isogel IEF plates, pH range
3-10. A Biorad (Richmond, CA) electrophoresis unit was used to run the IEF, following the directions of both
manufacturers. The electrophoresis conditions were as follows: 500 V (limiting) at 20 mA and at 10 W of constant power.
Focusing was complete in 90 minutes. Biorad IEF standards included phycocyanin, beta lactoglobulin B, bovine
carbonic anhydrase, human carbonic anhydrase, equine myoglobin, human hemoglobins A and C, 3 lentil lectins, and
cytochrome C, with pI values of 4.65, 5.10, 6.00, 6.50, 7.00, 7.10 and 7.50, 7.80, 8.00 and 8.20, and 9.60, respectively.
Gels were stained and destained according to directions provided by FMC. The DNASTAR program predicted a pI of 8.1
for both scFv2 species. For both pure products, a single, homogeneous band was observed on the gel at a pI of
6.9.

[0110] Purified CC49 antibodies such as the IgG and scFv2 (LHLH and LHHL) were quantitated by measuring the
absorbance spectrophotometrically at 280 nm. Molar absorptivity values, E m, were determined for each using the formula
cited above by Wetlaufer.

[0111] Based on the amino acid composition, the E 0.1% (280 nanometers) values for CC49 IgG, CC49 scFv2 LHLH,
CC49 scFv2 LHHL and CC49 scFv were 1.49, 1.65, 1.65 and 1.71, respectively.

Example 4 

[0112] Relative activities of the CC49 scFv2 species, LHLH and LHHL, were compared with the IgG and a monomeric
scFv form with a FLAG peptide at the COOH terminus.

[0113] Percent competition was determined from the ELISA data by the following equation:

Percent competition = [(zero competition - sample reading (OD 405-450 nm)) / (zero competition - 100 percent competition)] x 100
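
The percent competition equation can be written as a small function. The sketch below is a minimal illustration; the function name and the absorbance readings in the example are hypothetical and are not data from Figure 11.

```python
def percent_competition(sample_od, zero_competition_od, full_competition_od):
    """Percent competition from OD (405 nm minus 450 nm) readings."""
    span = zero_competition_od - full_competition_od
    if span == 0:
        raise ValueError("zero and 100 percent competition controls are identical")
    return (zero_competition_od - sample_od) / span * 100.0


# Hypothetical readings: controls at 1.0 (zero) and 0.25 (100 percent).
print(percent_competition(0.625, 1.0, 0.25))  # 50.0
print(percent_competition(1.0, 1.0, 0.25))    # 0.0
print(percent_competition(0.25, 1.0, 0.25))   # 100.0
```

A sample reading equal to the zero competition control gives 0 percent, and a reading equal to the 100 percent competition control gives 100 percent, which is a quick sanity check on the sign convention.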

[0114] The "zero competition" value was determined by mixing (1:1) one percent BSA with the biotinylated CC49
(3 x 10^-14 moles), while the 100 percent competition value was based on a 5 µg/mL sample of CC49 IgG mixed with
the biotinylated CC49 IgG. The data are presented in Figure 11. Absorbance values for the samples were measured
at 405 nm - 450 nm. The average of triplicate readings was used. Initially, samples (25 µL) were applied to the TAG-72
coated microtiter plates at 1.0 x 10^-10 moles of binding sites/mL. Biotinylated CC49 (4 µg/µL diluted 1:20,000;
25 µL used) diluted the samples by a factor of 2. Serial dilutions (1:2) were performed. Both forms of the scFv2 are
approximately equivalent to the IgG (see Figure 11). In a separate experiment, a CC49 scFv monomer was compared
to a Fab fragment, both of which are monovalent, and these were also shown to be equivalent in their binding affinity
for TAG-72. These results indicate that both forms of the covalently linked dimers have 2 fully functional antigen binding
sites. This is the same increase in avidity as observed with the whole IgG, relative to a monomeric species.
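
The quantity of biotinylated CC49 per well and the sample dilution series follow from the figures in the preceding paragraph. The sketch below reproduces the arithmetic; the IgG molecular weight of 150,000 g/mol is an assumption not stated in the text, and the function names are hypothetical.

```python
IGG_MW_G_PER_MOL = 150_000.0  # assumed molecular weight of CC49 IgG


def biotinylated_cc49_moles(stock_ug_per_ul, dilution, volume_ul, mw=IGG_MW_G_PER_MOL):
    """Moles of biotinylated antibody in the volume added to each well."""
    grams = stock_ug_per_ul / dilution * volume_ul * 1e-6
    return grams / mw


def dilution_series(start, steps, factor=2.0):
    """Concentrations for a serial dilution beginning at `start`."""
    return [start / factor ** i for i in range(steps)]


# 4 ug/uL stock diluted 1:20,000, 25 uL per well.
moles = biotinylated_cc49_moles(4.0, 20_000, 25.0)
print(f"{moles:.1e} mol")  # about 3.3e-14 mol

# Samples applied at 1.0e-10 mol binding sites/mL, halved by the added
# biotinylated CC49, then serially diluted 1:2.
first_well = 1.0e-10 / 2
print(dilution_series(first_well, steps=4))
```

Under the assumed molecular weight, the result is consistent with the 3 x 10^-14 moles of biotinylated CC49 quoted in the text.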

[0115] These data also indicate that the scFv2 molecules, like their CC49 IgG parent, are candidates for
immunotherapeutic applications, but with the benefit of increased capillary permeability and more rapid biodistribution
pharmacokinetics. These advantages should allow multiple injections of compounds of the present invention and give
higher tumor:tissue ratios in immunotherapeutic treatment regimens for cancer treatment, relative to the existing IgG
molecules.

[0116] Other embodiments of the invention will be apparent to those skilled in the art from a consideration of this
specification or practice of the invention disclosed herein. It is intended that the specification and examples be
considered as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.

SEQUENCE LISTING

(1) GENERAL INFORMATION:
[0117]

(i) APPLICANT: The Dow Chemical Company

(ii) TITLE OF INVENTION: MULTIVALENT SINGLE CHAIN ANTIBODIES

(iii) NUMBER OF SEQUENCES: 23

(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: Duane C. Ulmer
(B) STREET: P.O. Box 1967
(C) CITY: Midland
(D) STATE: MI
(E) COUNTRY: US
(F) ZIP: 48641-1967

(v) COMPUTER READABLE FORM:

(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA:

(A) APPLICATION NUMBER:
(B) FILING DATE:
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:

(A) NAME: Ulmer, Duane C
(B) REGISTRATION NUMBER: 34,941
(C) REFERENCE/DOCKET NUMBER: 41,014-F

(ix) TELECOMMUNICATION INFORMATION:

(A) TELEPHONE: (517) 636-8104

(2) INFORMATION FOR SEQ ID NO:1:
[0118]

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 339 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:

GACATTGTGA TGTCACAGTC TCCATCCTCC CTACCTGTGT CAGTTGGCGA GAAGGTTACT 60
TTGAGCTGCA AGTCCAGTCA GAGCCTTTTA TATAGTGGTA ATCAAAAGAA CTACTTGGCC 120
TGGTACCAGC AGAAACCAGG GCAGTCTCCT AAACTGCTGA TTTACTGGGC ATCCGCTAGG 180
GAATCTGGGG TCCCTGATCG CTTCACAGGC AGTGGATCTG GGACAGATTT CACTCTCTCC 240
ATCAGCAGTG TGAAGACTGA AGACCTGGCA GTTTATTACT GTCAGCAGTA TTATAGCTAT 300
CCCCTCACGT TCGGTGCTGG GACCAAGCTG GTGCTGAAG 339

(2) INFORMATION FOR SEQ ID NO:2: 
[0119] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 113 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: single
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Asp Ile Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly
1               5                   10                  15

Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser
            20                  25                  30

Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln
        35                  40                  45

Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val
    50                  55                  60

Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser
65                  70                  75                  80

Ile Ser Ser Val Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln
                85                  90                  95

Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
            100                 105                 110

Lys

(2) INFORMATION FOR SEQ ID NO:3: 
[0120] 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 345 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:

GAGGTTCAGT TGCAGCAGTC TGACGCTGAG TTGGTGAAAC CTGGGGCTTC AGTGAAGATT 60
TCCTGCAAGG CTTCTGGCTA CACCTTCACT GACCATGCAA TTCACTGGGT GAAACAGAAC 120
CCTGAACAGG GCCTGGAATG GATTGGATAT TTTTCTCCCG GAAATGATGA TTTTAAATAC 180
AATGAGAGGT TCAAGGGCAA GGCCACACTG ACTGCAGACA AATCCTCCAG CACTGCCTAC 240
GTGCAGCTCA ACAGCCTGAC ATCTGAGGAT TCTGCAGTGT ATTTCTGTAC AAGATCCCTG 300
AATATGGCCT ACTGGGGTCA AGGAACCTCA GTCACCGTCT CCTCA 345



(2) INFORMATION FOR SEQ ID NO:4: 
[0121] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 115 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4: 



Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala
1 5 10 15

Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His
20 25 30

Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile
35 40 45

Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe
50 55 60

Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr
65 70 75 80

Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys
85 90 95

Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr
100 105 110

Val Ser Ser
115

(2) INFORMATION FOR SEQ ID NO:5: 
[0122] 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 25 amino acids

(B) TYPE: amino acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala
1 5 10 15

Lys Lys Asp Asp Ala Lys Lys Asp Leu
20 25

(2) INFORMATION FOR SEQ ID NO:6: 
[0123] 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2165 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: double

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:6:

CTCATGTTTG ACAGCTTATC ATCGATGAAT TCCATCACTT CCCTCCGTTC ATTTGTCCCC 60

GGTGGAAACG AGGTCATCAT TTCCTTCCGA AAAAACGGTT GCATTTAAAT CTTACATATA 120

TAATACTTTC AAAGACTACA TTTGTAAGAT TTGATGTTTG AGTCGGCTGA AAGATCGTAC 180

GTACCAATTA TTGTTTCGTG ATTGTTCAAG CCATAACACT GTAGGGATAG TGGAAAGAGT 240

GCTTCATCTG GTTACGATCA ATCAAATATT CAAACGGAGG GAGACGATTT TGATGAAATA 300

CCTATTGCCT ACGGCAGCCG CTGGATTGTT ATTACTCGCT GCCCAACCAG CCATGGCCGA 360

CATTGTGATG TCACAGTCTC CATCCTCCCT ACCTGTGTCA GTTGGCGAGA AGGTTACTTT 420

GAGCTGCAAG TCCAGTCAGA GCCTTTTATA TAGTGGTAAT CAAAAGAACT ACTTGGCCTG 480

GTACCAGCAG AAACCAGGGC AGTCTCCTAA ACTGCTGATT TACTGGGCAT CCGCTAGGGA 540

ATCTGGGGTC CCTGATCGCT TCACAGGCAG TGGATCTGGG ACAGATTTCA CTCTCTCCAT 600

CAGCAGTGTG AAGACTGAAG ACCTGGCAGT TTATTACTGT CAGCAGTATT ATAGCTATCC 660

CCTCACGTTC GGTGCTGGGA CCAAGCTGGT GCTGAAGCTT AGTGCGGACG ATGCGAAAAA 720

GGATGCTGCG AAGAAGGATG ACGCTAAGAA AGACGATGCT AAAAAGGACC TCGAGGTTCA 780

GTTGCAGCAG TCTGACGCTG AGTTGGTGAA ACCTGGGGCT TCAGTGAAGA TTTCCTGCAA 840

GGCTTCTGGC TACACCTTCA CTGACCATGC AATTCACTGG GTGAAACAGA ACCCTGAACA 900

GGGCCTGGAA TGGATTGGAT ATTTTTCTCC CGGAAATGAT GATTTTAAAT ACAATGAGAG 960

GTTCAAGGGC AAGGCCACAC TGACTGCAGA CAAATCCTCC AGCACTGCCT ACGTGCAGCT 1020

CAACAGCCTG ACATCTGAGG ATTCTGCAGT GTATTTCTGT ACAAGATCCC TGAATATGGC 1080

CTACTGGGGT CAAGGAACCT CAGTCACCGT CTCCTCACTA AGCGCAGATG ACGCAAAGAA 1140

AGACGCAGCT AAAAAAGACG ATGCCAAAAA GGATGACGCC AAGAAAGATC TTGACATTGT 1200

GATGTCACAG TCTCCATCCT CCCTACCTGT GTCAGTTGGC GAGAAGGTTA CTTTGAGCTG 1260

CAAGTCCAGT CAGAGCCTTT TATATAGTGG TAATCAAAAG AACTACTTGG CCTGGTACCA 1320

GCAGAAACCA GGGCAGTCTC CTAAACTGCT GATTTACTGG GCATCCGCTA GGGAATCTGG 1380

GGTCCCTGAT CGCTTCACAG GCAGTGGATC TGGGACAGAT TTCACTCTCT CCATCAGCAG 1440

TGTGAAGACT GAAGACCTGG CAGTTTATTA CTGTCAGCAG TATTATAGCT ATCCCCTCAC 1500

GTTCGGTGCT GGGACCAAGC TGGTGCTGAA GCTAAGCGCT GATGATGCTA AGAAGGACGC 1560

CGCAAAAAAG GACGACGCAA AAAAGGATGA TGCAAAAAAG GATCTGGAGG TTCAGTTGCA 1620

GCAGTCTGAC GCTGAGTTGG TGAAACCTGG GGCTTCAGTG AAGATTTCCT GCAAGGCTTC 1680

TGGCTACACC TTCACTGACC ATGCAATTCA CTGGGTGAAA CAGAACCCTG AACAGGGCCT 1740

GGAATGGATT GGATATTTTT CTCCCGGAAA TGATGATTTT AAATACAATG AGAGGTTCAA 1800

GGGCAAGGCC ACACTGACTG CAGACAAATC CTCCAGCACT GCCTACGTGC AGCTCAACAG 1860

CCTGACATCT GAGGATTCTG CAGTGTATTT CTGTACAAGA TCCCTGAATA TGGCCTACTG 1920

GGGTCAAGGA ACCTCAGTCA CCGTCTCCTC ATAAAAAGCT AGCGATGAAT CCGTCAAAAC 1980

ATCATCTTAC ATAAAGTCAC TTGGTCATCA AGCTCATATC ATTGTCCGGC AATGGTGTGG 2040

GCTTTTTTTG TTTTCTATCT TTAAAGATCA TGTGAAGAAA AACGGGAAAA TCGGTCTGCG 2100

GGAAAGGACC GGGTTTTTGT CGAAATCATA GGCGAATGGG TTGGATTGTG ACAAAATTCG 2160

GATCC 2165



(2) INFORMATION FOR SEQ ID NO:7: 
[0124] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 553 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:7: 



Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
-20 -15 -10

Ala Gln Pro Ala Met Ala Asp Ile Val Met Ser Gln Ser Pro Ser Ser
-5 1 5 10

Leu Pro Val Ser Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser
15 20 25

Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr
30 35 40

Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser
45 50 55

Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly
60 65 70

Thr Asp Phe Thr Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala
75 80 85 90

Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala
95 100 105

Gly Thr Lys Leu Val Leu Lys Leu Ser Ala Asp Asp Ala Lys Lys Asp
110 115 120

Ala Ala Lys Lys Asp Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu
125 130 135

Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala
140 145 150

Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His
155 160 165 170

Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile
175 180 185

Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe
190 195 200

Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr
205 210 215

Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys
220 225 230

Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr
235 240 245 250

Val Ser Ser Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys
255 260 265

Asp Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu Asp Ile Val Met
270 275 280

Ser Gln Ser Pro Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr
285 290 295

Leu Ser Cys Lys Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys
300 305 310

Asn Tyr Leu Ala Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu
315 320 325 330

Leu Ile Tyr Trp Ala Ser Ala Arg Glu Ser Gly Val Pro Asp Arg Phe
335 340 345

Thr Gly Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Ser Val
350 355 360

Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr
365 370 375

Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu Lys Leu Ser Ala
380 385 390

Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala Lys Lys Asp
395 400 405 410

Asp Ala Lys Lys Asp Leu Glu Val Gln Leu Gln Gln Ser Asp Ala Glu
415 420 425

Leu Val Lys Pro Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly
430 435 440

Tyr Thr Phe Thr Asp His Ala Ile His Trp Val Lys Gln Asn Pro Glu
445 450 455

Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe
460 465 470

Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys
475 480 485 490

Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn Ser Leu Thr Ser Glu Asp
495 500 505

Ser Ala Val Tyr Phe Cys Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly
510 515 520

Gln Gly Thr Ser Val Thr Val Ser Ser
525 530
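
The stated length of 553 residues for SEQ ID NO:7 can be cross-checked against its parts. The sketch below assumes, from the listing itself, a 22-residue leading segment (the residues numbered below 1), followed by the variable domains of SEQ ID NO:2 (113 residues) and SEQ ID NO:4 (115 residues) joined by the 25-residue linker of SEQ ID NO:5 in the order VL, linker, VH, linker, VL, linker, VH. The variable and function names are illustrative only.

```python
# Sketch of the length arithmetic for SEQ ID NO:7; names are illustrative.
LEADER = 22   # residues numbered -22 to -1 in the listing (assumed leader segment)
VL = 113      # length of SEQ ID NO:2
VH = 115      # length of SEQ ID NO:4
LINKER = 25   # length of SEQ ID NO:5

# Domain order read off the listing of SEQ ID NO:7.
layout = [("VL", VL), ("linker", LINKER), ("VH", VH), ("linker", LINKER),
          ("VL", VL), ("linker", LINKER), ("VH", VH)]


def boundaries(parts):
    """Return (name, first, last) in the listing's numbering, which starts at 1."""
    out, start = [], 1
    for name, length in parts:
        out.append((name, start, start + length - 1))
        start += length
    return out


segments = boundaries(layout)
total = LEADER + sum(length for _, length in layout)

for name, first, last in segments:
    print(f"{name:6s} {first:3d}-{last:3d}")
print("total residues:", total)
```

Under these assumptions the first linker spans residues 114 to 138 and the final heavy chain domain ends at residue 531, which agrees with the numbering printed under the sequence.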



(2) INFORMATION FOR SEQ ID NO:8: 
[0125] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2165 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: double 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:8: 



CTCATGTTTG ACAGCTTATC ATCGATGAAT TCCATCACTT CCCTCCGTTC ATTTGTCCCC 60

GGTGGAAACG AGGTCATCAT TTCCTTCCGA AAAAACGGTT GCATTTAAAT CTTACATATA 120

TAATACTTTC AAAGACTACA TTTGTAAGAT TTGATGTTTG AGTCGGCTGA AAGATCGTAC 180

GTACCAATTA TTGTTTCGTG ATTGTTCAAG CCATAACACT GTAGGGATAG TGGAAAGAGT 240

GCTTCATCTG GTTACGATCA ATCAAATATT CAAACGGAGG GAGACGATTT TGATGAAATA 300

CCTATTGCCT ACGGCAGCCG CTGGATTGTT ATTACTCGCT GCCCAACCAG CCATGGCCGA 360

CATTGTGATG TCACAGTCTC CATCCTCCCT ACCTGTGTCA GTTGGCGAGA AGGTTACTTT 420

GAGCTGCAAG TCCAGTCAGA GCCTTTTATA TAGTGGTAAT CAAAAGAACT ACTTGGCCTG 480

GTACCAGCAG AAACCAGGGC AGTCTCCTAA ACTGCTGATT TACTGGGCAT CCGCTAGGGA 540

ATCTGGGGTC CCTGATCGCT TCACAGGCAG TGGATCTGGG ACAGATTTCA CTCTCTCCAT 600

CAGCAGTGTG AAGACTGAAG ACCTGGCAGT TTATTACTGT CAGCAGTATT ATAGCTATCC 660

CCTCACGTTC GGTGCTGGGA CCAAGCTGGT GCTGAAGCTT AGTGCGGACG ATGCGAAAAA 720

GGATGCTGCG AAGAAGGATG ACGCTAAGAA AGACGATGCT AAAAAGGACC TCGAGGTTCA 780

GTTGCAGCAG TCTGACGCTG AGTTGGTGAA ACCTGGGGCT TCAGTGAAGA TTTCCTGCAA 840

GGCTTCTGGC TACACCTTCA CTGACCATGC AATTCACTGG GTGAAACAGA ACCCTGAACA 900

GGGCCTGGAA TGGATTGGAT ATTTTTCTCC CGGAAATGAT GATTTTAAAT ACAATGAGAG 960

GTTCAAGGGC AAGGCCACAC TGACTGCAGA CAAATCCTCC AGCACTGCCT ACGTGCAGCT 1020

CAACAGCCTG ACATCTGAGG ATTCTGCAGT GTATTTCTGT ACAAGATCCC TGAATATGGC 1080

CTACTGGGGT CAAGGAACCT CAGTCACCGT CTCCTCACTA AGCGCAGATG ACGCAAAGAA 1140

AGACGCAGCT AAAAAAGACG ATGCCAAAAA GGATGACGCC AAGAAAGATC TTGAGGTTCA 1200

GTTGCAGCAG TCTGACGCTG AGTTGGTGAA ACCTGGGGCT TCAGTGAAGA TTTCCTGCAA 1260

GGCTTCTGGC TACACCTTCA CTGACCATGC AATTCACTGG GTGAAACAGA ACCCTGAACA 1320

GGGCCTGGAA TGGATTGGAT ATTTTTCTCC CGGAAATGAT GATTTTAAAT ACAATGAGAG 1380

GTTCAAGGGC AAGGCCACAC TGACTGCAGA CAAATCCTCC AGCACTGCCT ACGTGCAGCT 1440

CAACAGCCTG ACATCTGAGG ATTCTGCAGT GTATTTCTGT ACAAGATCCC TGAATATGGC 1500

CTACTGGGGT CAAGGAACCT CAGTCACCGT CTCCTCACTA AGCGCAGATG ACGCAAAGAA 1560

AGACGCAGCT AAAAAAGACG ATGCCAAAAA GGATGACGCC AAGAAAGATC TTGACATTGT 1620

GATGTCACAG TCTCCATCCT CCCTACCTGT GTCAGTTGGC GAGAAGGTTA CTTTGAGCTG 1680

CAAGTCCAGT CAGAGCCTTT TATATAGTGG TAATCAAAAG AACTACTTGG CCTGGTACCA 1740

GCAGAAACCA GGGCAGTCTC CTAAACTGCT GATTTACTGG GCATCCGCTA GGGAATCTGG 1800

GGTCCCTGAT CGCTTCACAG GCAGTGGATC TGGGACAGAT TTCACTCTCT CCATCAGCAG 1860

TGTGAAGACT GAAGACCTGG CAGTTTATTA CTGTCAGCAG TATTATAGCT ATCCCCTCAC 1920

GTTCGGTGCT GGCACCAAGC TGGTGCTTAA GTAAAAAGCT AGCGATGAAT CCGTCAAAAC 1980

ATCATCTTAC ATAAAGTCAC TTGGTGATCA AGCTCATATC ATTGTCCGGC AATGGTGTGG 2040

GCTTTTTTTG TTTTCTATCT TTAAAGATCA TGTGAAGAAA AACGGGAAAA TCGGTCTGCG 2100

GGAAAGGACC GGGTTTTTGT CGAAATCATA GGCGAATGGG TTGGATTGTG ACAAAATTCG 2160

GATCC 2165



(2) INFORMATION FOR SEQ ID NO:9: 
[0126] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 553 amino acids 

(B) TYPE: amino acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(ix) FEATURE: 

(A) NAME/KEY: Protein 

(B) LOCATION: 23 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:9: 

Met Lys Tyr Leu Leu Pro Thr Ala Ala Ala Gly Leu Leu Leu Leu Ala
-20 -15 -10

Ala Gln Pro Ala Met Ala Asp Ile Val Met Ser Gln Ser Pro Ser Ser
-5 1 5 10

Leu Pro Val Ser Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser
15 20 25

Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr
30 35 40

Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp Ala Ser
45 50 55

Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly Ser Gly
60 65 70

Thr Asp Phe Thr Leu Ser Ile Ser Ser Val Lys Thr Glu Asp Leu Ala
75 80 85 90






EP 0 628 078 B1 



Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe Gly Ala
95 100 105

Gly Thr Lys Leu Val Leu Lys Leu Ser Ala Asp Asp Ala Lys Lys Asp
110 115 120

Ala Ala Lys Lys Asp Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu
125 130 135

Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala
140 145 150

Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His
155 160 165 170

Ala Ile His Trp Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile
175 180 185

Gly Tyr Phe Ser Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe
190 195 200

Lys Gly Lys Ala Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr
205 210 215

Val Gln Leu Asn Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys
220 225 230

Thr Arg Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr
235 240 245 250

Val Ser Ser Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys
255 260 265

Asp Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu Glu Val Gln Leu
270 275 280

Gln Gln Ser Asp Ala Glu Leu Val Lys Pro Gly Ala Ser Val Lys Ile
285 290 295

Ser Cys Lys Ala Ser Gly Tyr Thr Phe Thr Asp His Ala Ile His Trp
300 305 310

Val Lys Gln Asn Pro Glu Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser
315 320 325 330

Pro Gly Asn Asp Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala
335 340 345

Thr Leu Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
350 355 360

Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg Ser Leu
365 370 375

Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr Val Ser Ser Leu
380 385 390

Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala Lys
395 400 405 410

Lys Asp Asp Ala Lys Lys Asp Leu Asp Ile Val Met Ser Gln Ser Pro
415 420 425

Ser Ser Leu Pro Val Ser Val Gly Glu Lys Val Thr Leu Ser Cys Lys
430 435 440

Ser Ser Gln Ser Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala
445 450 455

Trp Tyr Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp
460 465 470

Ala Ser Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly Ser Gly
475 480 485 490

Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Ser Val Lys Thr Glu Asp
495 500 505

Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr Ser Tyr Pro Leu Thr Phe
510 515 520

Gly Ala Gly Thr Lys Leu Val Leu Lys
525 530



(2) INFORMATION FOR SEQ ID NO:10: 
[0127] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 25 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:10: 



TAAACTCGAG GTTCAGTTGC AGCAG 



(2) INFORMATION FOR SEQ ID NO:11: 
[0128] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 44 base pairs 

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:11:

TAAAGCTAGC ACCAAGCGCT TAGTGAGGAG ACGGTGACTG AGGT 44

(2) INFORMATION FOR SEQ ID NO:12:
[0129] 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:12:

TCGTCCGATT AGGCAAGCTT A 21

(2) INFORMATION FOR SEQ ID NO:13:
[0130]

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:13: 



GATGATTTTA AATACAATGA G 21 



(2) INFORMATION FOR SEQ ID NO:14:
[0131]

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 98 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:14:

TAAATGCGCA GATGACGCAA AGAAAGACGC AGCTAAAAAA GACGATGCCA AAAAGGATGA 60
CGCCAAGAAA GATCTTGAGG TTCAGTTGCA GCAGTCTG 98

(2) INFORMATION FOR SEQ ID NO:15: 
[0132] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 21 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:15: 

TGACTTTATG TAAGATGATG T 

(2) INFORMATION FOR SEQ ID NO:16: 
[0133] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 99 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:16: 

TAAATGCGCA GATGACGCAA AGAAAGACGC AGCTAAAAAA GACGATGCCA AAAAGGATGA 
CGCCAAGAAA GATCTTGACA TTGTGATGTC ACAGTCTCC 

(2) INFORMATION FOR SEQ ID NO:17:
[0134] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 37 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:17: 

TAAAGCTAGC TTTTTACTTA AGCACCAGCT TGGTCCC

(2) INFORMATION FOR SEQ ID NO:18:
[0135] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 19 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:18: 

CTGCTGGTAC CAGGCCAAG 

(2) INFORMATION FOR SEQ ID NO:19:
[0136] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 46 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:19:

TAAAGCTAGC ACCAAGCGCT TAGTTTCAGC ACCAGCTTGG TCCCAG 

(2) INFORMATION FOR SEQ ID NO:20: 
[0137] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 22 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single 

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:20: 

TTGATCACCA AGTGACTTTA TG 
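
Some of the short single-stranded sequences in this listing read as the reverse complement of a stretch of the longer double-stranded sequences. The listing itself does not state their purpose, so the sketch below only illustrates the comparison: it assumes SEQ ID NO:20 pairs with the opposite strand of SEQ ID NO:8 and checks it against positions 1981 to 2040 of that sequence as listed. The function and variable names are illustrative.

```python
# Sketch: compare a short oligonucleotide with the opposite strand of a longer
# sequence. The pairing of SEQ ID NO:20 with SEQ ID NO:8 is an assumption.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of a DNA string, ignoring whitespace."""
    seq = "".join(dna.split()).upper()
    return seq.translate(COMPLEMENT)[::-1]


seq20 = "TTGATCACCA AGTGACTTTA TG"

# Positions 1981-2040 of SEQ ID NO:8 as listed, with the 10-base spacing removed.
seq8_excerpt = (
    "ATCATCTTAC ATAAAGTCAC TTGGTGATCA AGCTCATATC ATTGTCCGGC AATGGTGTGG"
).replace(" ", "")

rc = reverse_complement(seq20)
print(rc, rc in seq8_excerpt)
```

A match here only shows that the two listings are mutually consistent; it says nothing about how the oligonucleotide was used.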

(2) INFORMATION FOR SEQ ID NO:21: 
[0138] 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 98 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:21:

TAAGCGCTGA TGATGCTAAG AAGGACGCCG CAAAAAAGGA CGACGCAAAA AAAGATGATG 60
CAAAAAAGGA TCTGCAGGTT CAGTTGCAGC AGTCTGAC 98



(2) INFORMATION FOR SEQ ID NO:22: 
[0139] 

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 38 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:22: 



TTGTGCTAGC TTTTTATGAG GAGACGGTGA CTGAGGTT 38 

(2) INFORMATION FOR SEQ ID NO:23: 
[0140] 

(i) SEQUENCE CHARACTERISTICS: 



(A) LENGTH: 18 base pairs 

(B) TYPE: nucleic acid 

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear 

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:23: 



CAGCAGTATT ATAGCTAT 18 



Claims



1. A multivalent single chain antibody which comprises two or more single chain antibody fragments, each single 
chain antibody fragment specifically binding an antigen, wherein the single chain antibody fragments are covalently 
linked by a first peptide linker which contains an amino acid sequence of 



Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp 
Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu 

and each single chain antibody fragment comprises

(a) a first polypeptide comprising a light chain variable domain;

(b) a second polypeptide comprising a heavy chain variable domain; and 

(c) a second peptide linker linking the first and second polypeptides into a functional binding moiety. 

2. The multivalent single chain antibody of claim 1, wherein the light chain variable region and the heavy chain variable
region are obtained from antibodies against tumor-associated glycoprotein 72 antigen (TAG-72).

3. The multivalent single chain antibody of claim 1, wherein the light chain variable region has an amino acid sequence
of

Asp Ile Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser
Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr
Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp
Ala Ser Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly
Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Ser Val
Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr
Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
Lys

and the heavy chain variable region has an amino acid sequence of

Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro
Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr
Phe Thr Asp His Ala Ile His Trp Val Lys Gln Asn Pro Glu
Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly Asn Asp
Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu
Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg
Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr
Val Ser Ser.

4. The multivalent single chain antibody of claim 1, wherein the first peptide linker has an amino acid sequence having
25 to 30 amino acid residues.

5. The multivalent single chain antibody of claim 1, wherein the second peptide linker has an amino acid sequence
having from 10 to 30 amino acid residues.

6. The multivalent single chain antibody of claim 1, wherein the first and second peptide linkers have substantially
the same amino acid sequence.

7. The multivalent single chain antibody of claim 6, wherein the second peptide linker has an amino acid sequence
identical to that of the first peptide linker.

8. A DNA sequence which codes for a multivalent single chain antibody, the multivalent single chain antibody
comprising two or more single chain antibody fragments, each fragment having affinity for an antigen, wherein the
fragments are covalently linked by a first peptide linker which contains an amino acid sequence of

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala Lys
Lys Asp Asp Ala Lys Lys Asp Leu

and each fragment comprising

(a) a first polypeptide comprising a light chain variable domain;

(b) a second polypeptide comprising a heavy chain variable domain; and 

(c) a second peptide linker linking the first and second polypeptides into a functional binding moiety. 

9. The DNA sequence of claim 8, which codes for a multivalent single chain antibody, wherein the light chain variable 
region and the heavy chain variable region are obtained from antibodies against tumor-associated glycoprotein 
72 antigen (TAG-72). 

10. The DNA sequence of claim 8, wherein the sequence coding for the first polypeptide is substantially homologous 
to the sequence: 

GAC ATT GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA 
GTT GGC GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG 
AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC 
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG 
ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC CCT GAT 
CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC 
TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT 
TAC TGT CAG CAG TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT 
GGG ACC AAG CTG GTG CTG AAG 

and the first polypeptide retains the characteristic of functional binding to TAG-72 

and wherein the sequence coding for the second polypeptide is substantially homologous to the sequence: 

GAG GTT CAG TTG CAG CAG TCT GAC GCT GAG TTG GTG AAA 
CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC 
TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG GTG AAA CAG 
AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT 
CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG 
GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT 
GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT 
GCA GTG TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC 
TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TCA. 

and the second polypeptide retains the characteristic of functional binding to TAG-72. 



Patentansprüche (German claims)

1. A multivalent single chain antibody which comprises two or more single chain antibody fragments, each single
chain antibody fragment binding specifically to an antigen, wherein the single chain antibody fragments are
covalently linked by a first peptide linker which contains an amino acid sequence of

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp
Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu

and each single chain antibody fragment comprises

(a) a first polypeptide comprising a variable domain of a light chain;

(b) a second polypeptide comprising a variable domain of a heavy chain; and

(c) a second peptide linker which links the first and the second polypeptide into a functional binding moiety.

2. The multivalent single chain antibody according to claim 1, wherein the variable region of the light chain and the
variable region of the heavy chain are obtained from antibodies against tumor-associated glycoprotein 72 antigen
(TAG-72).

3. The multivalent single chain antibody according to claim 1, wherein the variable region of the light chain has an
amino acid sequence of

Asp Ile Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser
Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr
Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp
Ala Ser Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly
Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Ser Val
Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr
Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
Lys

and the variable region of the heavy chain has an amino acid sequence of

Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro
Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr
Phe Thr Asp His Ala Ile His Trp Val Lys Gln Asn Pro Glu
Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly Asn Asp
Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu
Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg
Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr
Val Ser Ser.

4. The multivalent single chain antibody according to claim 1, wherein the first peptide linker has an amino acid
sequence having 25 to 30 amino acid residues.

5. The multivalent single chain antibody according to claim 1, wherein the second peptide linker has an amino acid
sequence having from 10 to 30 amino acid residues.

6. The multivalent single chain antibody according to claim 1, wherein the first and the second peptide linker have
substantially the same amino acid sequence.

7. The multivalent single chain antibody according to claim 6, wherein the second peptide linker has an amino acid
sequence which is identical to that of the first peptide linker.

8. A DNA sequence which codes for a multivalent single chain antibody, the multivalent single chain antibody
comprising two or more single chain antibody fragments, each fragment having an affinity for an antigen, wherein
the fragments are covalently linked by a first polypeptide linker which contains an amino acid sequence of

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp Asp Ala Lys
Lys Asp Asp Ala Lys Lys Asp Leu

and each fragment comprises

(a) a first polypeptide comprising a variable domain of a light chain;

(b) a second polypeptide comprising a variable domain of a heavy chain; and

(c) a second peptide linker which links the first and the second polypeptide into a functional binding moiety.

9. The DNA sequence according to claim 8, which codes for a multivalent single chain antibody, wherein the variable
region of the light chain and the variable region of the heavy chain are obtained from antibodies against
tumor-associated glycoprotein 72 antigen (TAG-72).

10. The DNA sequence according to claim 8, wherein the sequence coding for the first polypeptide is substantially
homologous to the sequence:

GAC ATT GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA
GTT GGC GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG
AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG
ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC CCT GAT
CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC
TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT
TAC TGT CAG CAG TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT
GGG ACC AAG CTG GTG CTG AAG

and the first polypeptide retains the characteristic of functional binding to TAG-72, and wherein the sequence
coding for the second polypeptide is substantially homologous to the sequence

GAG GTT CAG TTG CAG CAG TCT GAC GCT GAG TTG GTG AAA
CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC
TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG GTG AAA CAG
AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT
CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG
GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT
GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT
GCA GTG TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC
TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TCA.

and the second polypeptide retains the characteristic of functional binding to TAG-72.

Revendications (French claims)

1. A multivalent single chain antibody which comprises two or more single chain antibody fragments, each single
chain antibody fragment specifically binding an antigen, in which antibody the single chain antibody fragments
are covalently linked by a first peptide linker which contains the following amino acid sequence:

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp
Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu

and each single chain antibody fragment comprises

a) a first polypeptide constituting a light chain variable domain;

b) a second polypeptide constituting a heavy chain variable domain; and

c) a second peptide linker joining these first and second polypeptides into a functional binding entity.

2. The multivalent single chain antibody according to claim 1, in which the light chain variable domain and the
heavy chain variable domain have been obtained from antibodies directed against the antigen tumor-associated
glycoprotein 72 (TAG-72 antigen).

3. The multivalent single chain antibody according to claim 1, in which the light chain variable domain has the
following amino acid sequence:

Asp Ile Val Met Ser Gln Ser Pro Ser Ser Leu Pro Val Ser
Val Gly Glu Lys Val Thr Leu Ser Cys Lys Ser Ser Gln Ser
Leu Leu Tyr Ser Gly Asn Gln Lys Asn Tyr Leu Ala Trp Tyr
Gln Gln Lys Pro Gly Gln Ser Pro Lys Leu Leu Ile Tyr Trp
Ala Ser Ala Arg Glu Ser Gly Val Pro Asp Arg Phe Thr Gly
Ser Gly Ser Gly Thr Asp Phe Thr Leu Ser Ile Ser Ser Val
Lys Thr Glu Asp Leu Ala Val Tyr Tyr Cys Gln Gln Tyr Tyr
Ser Tyr Pro Leu Thr Phe Gly Ala Gly Thr Lys Leu Val Leu
Lys

and the heavy chain variable domain has the following amino acid sequence:

Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys Pro
Gly Ala Ser Val Lys Ile Ser Cys Lys Ala Ser Gly Tyr Thr
Phe Thr Asp His Ala Ile His Trp Val Lys Gln Asn Pro Glu
Gln Gly Leu Glu Trp Ile Gly Tyr Phe Ser Pro Gly Asn Asp
Asp Phe Lys Tyr Asn Glu Arg Phe Lys Gly Lys Ala Thr Leu
Thr Ala Asp Lys Ser Ser Ser Thr Ala Tyr Val Gln Leu Asn
Ser Leu Thr Ser Glu Asp Ser Ala Val Tyr Phe Cys Thr Arg
Ser Leu Asn Met Ala Tyr Trp Gly Gln Gly Thr Ser Val Thr
Val Ser Ser.

4. The multivalent single chain antibody according to claim 1, wherein the amino acid sequence of the
first peptide linker comprises from 25 to 30 amino acid residues.

5. The multivalent single chain antibody according to claim 1, wherein the amino acid sequence of the
second peptide linker comprises from 10 to 30 amino acid residues.

6. The multivalent single chain antibody according to claim 1, wherein the first and second peptide linkers
have substantially the same amino acid sequence.

7. The multivalent single chain antibody according to claim 1, wherein the amino acid sequence of the
second peptide linker is identical to that of the first peptide linker.

8. A DNA sequence which codes for a multivalent single chain antibody, the multivalent single chain antibody
comprising two or more single chain antibody fragments, each of these fragments having affinity
for an antigen, and these fragments being covalently linked by a first peptide linker
which contains the following amino acid sequence:

Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp 
Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu 

and each fragment comprising

a) a first polypeptide comprising a light chain variable domain;

b) a second polypeptide comprising a heavy chain variable domain; and

c) a second peptide linker linking the first and second polypeptides into a functional binding
moiety.
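
As an illustrative aside, the length of the first peptide linker recited in claim 8 can be checked with a short script. This is only a sketch: the names `THREE_TO_ONE`, `LINKER` and `linker_summary` are hypothetical helpers introduced here, and the mapping covers just the five residue types that occur in the linker.

```python
# Three-letter to one-letter residue codes (only those that occur in the linker).
THREE_TO_ONE = {"Leu": "L", "Ser": "S", "Ala": "A", "Asp": "D", "Lys": "K"}

# First peptide linker as recited in claim 8.
LINKER = (
    "Leu Ser Ala Asp Asp Ala Lys Lys Asp Ala Ala Lys Lys Asp "
    "Asp Ala Lys Lys Asp Asp Ala Lys Lys Asp Leu"
)


def linker_summary(seq):
    """Return (number of residues, one-letter string) for a three-letter sequence."""
    residues = seq.split()
    one_letter = "".join(THREE_TO_ONE[r] for r in residues)
    return len(residues), one_letter


length, compact = linker_summary(LINKER)
print(length, compact)  # 25 LSADDAKKDAAKKDDAKKDDAKKDL
```

The recited linker has 25 residues, which lies at the lower bound of the 25 to 30 residue range stated in claim 4.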

9. The DNA sequence according to claim 8, which codes for a multivalent single chain antibody wherein the
light chain variable domain and the heavy chain variable domain are derived from antibodies
directed against the antigen tumor-associated glycoprotein 72 (TAG-72 antigen).

10. The DNA sequence according to claim 8, wherein the sequence coding for the first polypeptide is
substantially homologous to the following sequence:

GAC ATT GTG ATG TCA CAG TCT CCA TCC TCC CTA CCT GTG TCA 
GTT GGC GAG AAG GTT ACT TTG AGC TGC AAG TCC AGT CAG 
AGC CTT TTA TAT AGT GGT AAT CAA AAG AAC TAC TTG GCC 
TGG TAC CAG CAG AAA CCA GGG CAG TCT CCT AAA CTG CTG 
ATT TAC TGG GCA TCC GCT AGG GAA TCT GGG GTC CCT GAT
CGC TTC ACA GGC AGT GGA TCT GGG ACA GAT TTC ACT CTC 
TCC ATC AGC AGT GTG AAG ACT GAA GAC CTG GCA GTT TAT 
TAC TGT CAG CAG TAT TAT AGC TAT CCC CTC ACG TTC GGT GCT 
GGG ACC AAG CTG GTG CTG AAG 

the first polypeptide retaining the functional binding property for the TAG-72 antigen,

and the sequence coding for the second polypeptide is substantially homologous to the following sequence:



GAG GTT CAG TTG CAG CAG TCT GAC GCT GAG TTG GTG AAA 
CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC 
TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG GTG AAA CAG 
AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT 
CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG 
GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT 
GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT 
GCA GTG TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC 
TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TCA

the second polypeptide retaining the functional binding property for the TAG-72 antigen.
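
The correspondence between the coding sequences of claim 10 and the amino acid sequences of claim 3 can be checked by translating each codon with the standard genetic code. The following is a minimal sketch, not part of the claims: `CODON_TABLE`, `ONE_TO_THREE` and `translate` are hypothetical names introduced here, and the input is assumed to be space-separated codons in the reading frame shown in the claim.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, with codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

# One-letter to three-letter residue codes; "*" marks a stop codon.
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "*": "Ter",
}


def translate(dna):
    """Translate space-separated codons into three-letter residue codes."""
    return " ".join(ONE_TO_THREE[CODON_TABLE[codon]] for codon in dna.split())


# First row of the second (heavy chain) coding sequence of claim 10.
first_row = "GAG GTT CAG TTG CAG CAG TCT GAC GCT GAG TTG GTG AAA"
print(translate(first_row))
# Glu Val Gln Leu Gln Gln Ser Asp Ala Glu Leu Val Lys
```

Applied to the complete sequences of claim 10, the same function gives the light chain and heavy chain variable domain sequences recited in claim 3.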
FIG. 4

GAG GTT CAG TTG CAG CAG TCT GAC GCT GAG TTG GTG AAA
CCT GGG GCT TCA GTG AAG ATT TCC TGC AAG GCT TCT GGC
TAC ACC TTC ACT GAC CAT GCA ATT CAC TGG GTG AAA CAG
AAC CCT GAA CAG GGC CTG GAA TGG ATT GGA TAT TTT TCT
CCC GGA AAT GAT GAT TTT AAA TAC AAT GAG AGG TTC AAG
GGC AAG GCC ACA CTG ACT GCA GAC AAA TCC TCC AGC ACT
GCC TAC GTG CAG CTC AAC AGC CTG ACA TCT GAG GAT TCT
GCA GTG TAT TTC TGT ACA AGA TCC CTG AAT ATG GCC TAC
TGG GGT CAA GGA ACC TCA GTC ACC GTC TCC TCA

FIG. 5 